Hello,
I am using featureCount to complete a gene level count of my RNAseq reads. From the results summary, it looks like I have a very high number multimapping reads, resulting in a low assingment rate (less than 15%).
I have paired end reads, with an unstranded library
The options I have chosen are as follows:
featureCounts -T 8 -p -a ~/file/path/sacCer3.ncbiRefSeq.gtf -t exon -g gene_id -o ~/file/path/fcresults/groseq2.txt UWN.bam UMN.bam
after processing the reads, this is what my summary looks like:
Status UWN.bam UMN.bam
Assigned 7608607 10093518
Unassigned_Unmapped 4743940 3388412
Unassigned_Read_Type 0 0
Unassigned_Singleton 0 0
Unassigned_MappingQuality 0 0
Unassigned_Chimera 0 0
Unassigned_FragmentLength 0 0
Unassigned_Duplicate 0 0
Unassigned_MultiMapping 66981505 58121996
Unassigned_Secondary 0 0
Unassigned_NonSplit 0 0
Unassigned_NoFeatures 1927159 2654380
Unassigned_Overlapping_Length 0 0
Unassigned_Ambiguity 10983810 8513544
From what I understand, this means I have reads (fragments) that are mapping to more than one feature.
My question is, if the SAM/BAM file I created was able to successfully map fragments to a reference genome, how is it possible that these fragments are assigned to more than one gene (meta feature) in my GTF file? Do I need to use a different annotation file, or is this a typical assignment rate for the count I am after?
Any help would appreciated! Thanks,
Hey - thanks for replying.
Yes, I used Hisat2 to map my reads, and got a good alignment rate (also ~70%). This is why I was so confused to get a poor assignment rate. I was wondering if anybody had come across discrepancies of this kind.
Thanks for your help.
I don't see a discrepancy. Your data is telling you that most of your reads are mapping to multiple places. I'd investigate what exactly those reads are all aligning to.
Thanks for your help. Can you tell me how I can parse which reads are aligning or mapping to multiple places? From what I've read there are NH tags in the .bam files, but I'm not sure how I can separate these, or see which features they are mapping to. Thanks