Below is the samtools flagstat
output for my BAM file, which only contains mapped reads (as seen from 100.00%
mapped)
I know featureCounts
by default does not consider/assign multimappers (i.e., secondary). Does it also skip over singletons/unproperly paired reads? Is it correct to say they only consider primary, uniquely mapped, properly-paired reads for assignment?
In that case, is there any way to tell from the below output how many reads that is? There seems to be 9539764
/ 2 = 4769882
properly paired reads, but I'm assuming some of these are multimappers? So it would be less than that number of reads that featureCounts
attempts to assign a genomic feature?
20155191 + 0 in total (QC-passed reads + QC-failed reads)
9566830 + 0 primary
10588361 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
0 + 0 primary duplicates
20155191 + 0 mapped (100.00% : N/A)
9566830 + 0 primary mapped (100.00% : N/A)
9566830 + 0 paired in sequencing
4788364 + 0 read1
4778466 + 0 read2
9539764 + 0 properly paired (99.72% : N/A)
9539764 + 0 with itself and mate mapped
27066 + 0 singletons (0.28% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
Assuming you are using new version of
featureCounts
did you specify-p --countReadPairs
in your command line?I'm using v.1.5.2, which I think doesn't have the new options, so my command looks something like:
note how featurecounts operates and counts reads over various intervals, that typically cover a subset of the genome. Hence you would expect to match fewer reads than what is in a BAM file.