samtools flagstat results:
some number + 0 in total (QC-passed reads + QC-failed reads)
This "some number" does not match the actual number of reads in my paired FASTQ files. Is this expected?
The number reported is the number of records actually in the file. If your aligner produced secondary or supplementary alignments, then this will often be higher than the original number of reads in the FASTQ files.
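To make the difference concrete, here is a minimal sketch in plain Python (no samtools or pysam needed) that counts records versus primary records from a list of SAM flag values. The function name `summarize` and the example flags are made up for illustration; only the flag bit values (256 for secondary, 2048 for supplementary) come from the SAM specification.

```python
# SAM flag bits, as defined in the SAM specification.
SECONDARY = 0x100      # 256
SUPPLEMENTARY = 0x800  # 2048


def summarize(flags):
    """Return (total_records, primary_records) for an iterable of SAM flags.

    total_records corresponds to what flagstat reports "in total";
    primary_records has one entry per read (mapped or unmapped).
    """
    flags = list(flags)
    total = len(flags)
    primary = sum(1 for f in flags if not f & (SECONDARY | SUPPLEMENTARY))
    return total, primary


# Hypothetical records for one read pair: two primary lines (99, 147),
# plus one secondary (355) and one supplementary (2147) line for mate 1.
example_flags = [99, 147, 355, 2147]
total, primary = summarize(example_flags)
print(total, primary)  # prints: 4 2
```

In this toy case the file holds 4 records for only 2 reads, which is the kind of mismatch described in the question.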
My 2p: It's good to keep in mind that SAM/BAM stores alignments, not reads (in fact, SAM stands for Sequence Alignment/Map), even though unmapped reads can also be present.
For this reason the simple question "how many reads have been aligned?" can be tricky to answer. A simple strategy is to count the reads that have not been aligned and subtract that from the raw read count from the FASTQ files. But if you want to know how many reads have been aligned with certain criteria (e.g. mapq > x, alignment score > y, properly paired, etc.), then you should also consider split reads, since a single read can then appear in more than one alignment record.
I suspect the use of the word "read" in samtools flagstat causes a lot of misunderstandings in this respect.
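A rough sketch of both counting strategies from this answer, in plain Python. Everything here is an assumption for illustration: the function names, the `(qname, flag, mapq)` tuple layout, and the choice to identify a read by its name plus its first/last-in-pair bits so that split (supplementary) records of the same read are counted once.

```python
UNMAPPED = 0x4
FIRST_IN_PAIR = 0x40
LAST_IN_PAIR = 0x80


def aligned_by_difference(fastq_reads, unmapped_reads):
    """Aligned reads = raw FASTQ read count minus unmapped reads."""
    return fastq_reads - unmapped_reads


def reads_passing(records, min_mapq):
    """Count distinct reads with at least one mapped record at MAPQ >= min_mapq.

    records: iterable of (qname, flag, mapq) tuples.
    """
    seen = set()
    for qname, flag, mapq in records:
        if flag & UNMAPPED:
            continue
        if mapq >= min_mapq:
            # Mates share a qname, so keep the mate bits in the key.
            mate = flag & (FIRST_IN_PAIR | LAST_IN_PAIR)
            seen.add((qname, mate))
    return len(seen)


# Hypothetical records: r1 is a proper pair whose first mate is split,
# r2 has a low-MAPQ first mate and an unmapped second mate.
records = [
    ("r1", 99, 60),
    ("r1", 147, 60),
    ("r1", 2147, 60),  # supplementary record of r1, mate 1
    ("r2", 73, 10),
    ("r2", 133, 0),    # unmapped
]
print(aligned_by_difference(4, 1))  # prints: 3
print(reads_passing(records, 30))   # prints: 2
```

Three records pass the MAPQ filter here, but they belong to only two reads, which is why counting records alone overstates the answer when split reads are present.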
Coming back to my post after quite some time.
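Adding the counts from these two commands should give the exact number of reads in your sample:
samtools view -f 4 your.bam -c
samtools view -F 2308 your.bam -c
Explanation
-f 4: accept only the reads with flag 4 set; this flag is for unmapped reads.
-F 2308: reject any record with flag 4 (unmapped), 256 (secondary) or 2048 (supplementary) set, since 2308 = 4 + 256 + 2048. This leaves one primary record per mapped read.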
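If the flag numbers look opaque, here is a small plain-Python sketch that decodes a flag value and emulates the `-f` / `-F` counting on a list of flags. The helper names (`decode`, `count_f`, `count_F`) and the example flags are hypothetical; the bit values and short names follow the SAM specification and samtools flag naming.

```python
FLAG_NAMES = {
    0x1: "PAIRED", 0x2: "PROPER_PAIR", 0x4: "UNMAP", 0x8: "MUNMAP",
    0x10: "REVERSE", 0x20: "MREVERSE", 0x40: "READ1", 0x80: "READ2",
    0x100: "SECONDARY", 0x200: "QCFAIL", 0x400: "DUP", 0x800: "SUPPLEMENTARY",
}


def decode(value):
    """List the names of the flag bits set in value."""
    return [name for bit, name in FLAG_NAMES.items() if value & bit]


def count_f(flags, required):
    """Emulate `samtools view -c -f`: keep records with ALL required bits set."""
    return sum(1 for f in flags if (f & required) == required)


def count_F(flags, excluded):
    """Emulate `samtools view -c -F`: drop records with ANY excluded bit set."""
    return sum(1 for f in flags if (f & excluded) == 0)


print(decode(2308))  # prints: ['UNMAP', 'SECONDARY', 'SUPPLEMENTARY']

# Hypothetical flags for two read pairs (4 reads, 6 records).
flags = [99, 147, 355, 2147, 73, 133]
unmapped = count_f(flags, 4)
mapped_primary = count_F(flags, 2308)
print(unmapped, mapped_primary, unmapped + mapped_primary)  # prints: 1 3 4
```

The same total can be obtained in one pass by excluding only secondary and supplementary records (`-F 2304`), since every read has exactly one primary record whether or not it is mapped.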