featureCounts - Low Assigned rate - Locations of reads
3.8 years ago
chrys ▴ 80

Well hello there,

I am using featureCounts from the Subread package to count third-generation reads produced by Nanopore sequencing (MinION) and mapped to a reference genome. Although the overall basecall quality of our reads was high and the mapping rate was also very good (94%), featureCounts only produced assignment rates of 50% to 60%.

The largest unassigned group is "NoFeatures", which made me wonder where those reads mapped.

Assigned    1057725  
Unassigned_Unmapped 62207  
Unassigned_Read_Type    0  
Unassigned_Singleton    0  
Unassigned_MappingQuality   0  
Unassigned_Chimera  0  
Unassigned_FragmentLength   0  
Unassigned_Duplicate    0  
Unassigned_MultiMapping 0  
Unassigned_Secondary    0  
Unassigned_NonSplit 0  
Unassigned_NoFeatures   457608  
Unassigned_Overlapping_Length   0  
Unassigned_Ambiguity    283748 
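
To put the categories above in proportion, here is a minimal sketch that turns the summary into percentages. It assumes the counts are pasted as text without the header line that a real featureCounts `.summary` file starts with; the function names are only illustrative.

```python
# Counts copied from the summary in this post (header line omitted).
summary_text = """\
Assigned 1057725
Unassigned_Unmapped 62207
Unassigned_Read_Type 0
Unassigned_Singleton 0
Unassigned_MappingQuality 0
Unassigned_Chimera 0
Unassigned_FragmentLength 0
Unassigned_Duplicate 0
Unassigned_MultiMapping 0
Unassigned_Secondary 0
Unassigned_NonSplit 0
Unassigned_NoFeatures 457608
Unassigned_Overlapping_Length 0
Unassigned_Ambiguity 283748
"""


def parse_summary(text):
    """Map each status name to its read count."""
    counts = {}
    for line in text.strip().splitlines():
        status, value = line.split()
        counts[status] = int(value)
    return counts


def fractions(counts):
    """Fraction of all reads per status, skipping empty categories."""
    total = sum(counts.values())
    return {status: n / total for status, n in counts.items() if n > 0}


counts = parse_summary(summary_text)
fracs = fractions(counts)

# Print the non-empty categories, largest first.
for status, frac in sorted(fracs.items(), key=lambda kv: -kv[1]):
    print(f"{status}\t{counts[status]}\t{frac:.1%}")
```

On these numbers, roughly 57% of reads are assigned, about 25% fall under NoFeatures and about 15% under Ambiguity.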

I used a custom annotation GFF (Gencode + custom features) to count the mappings. I was wondering if somebody knows a tool or a straightforward way (other than checking IGV visually) to find out where those reads are.

I am especially interested in whether we might have some kind of contamination by genomic DNA.

Any suggestions for QC / Tools / procedures are welcome. Thanks !

RNA-Seq featureCount QC • 2.5k views

If a read is mapped but does not overlap any feature in the GTF, then it falls in an intron or in an intergenic region. You can make a custom SAF file for featureCounts (see the manual) to count the reads for these features. Intergenic is the complement, over the entire genome, of the GTF entries of type="gene"; intronic is the entire genome minus intergenic and exonic, i.e. the parts of genes not covered by any exon.
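
The interval arithmetic described above can be sketched in a few lines of Python. This is a toy example under stated assumptions, not a full pipeline: the chromosome name, its length, and the gene and exon coordinates are made up, strand is ignored, and the `+` written to the Strand column is only a placeholder for unstranded counting. In practice the intervals would come from the GTF and the chromosome sizes from the reference index.

```python
def merge(intervals):
    """Merge overlapping or adjacent 1-based inclusive intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def subtract(regions, blockers):
    """Return the parts of `regions` not covered by `blockers`."""
    blockers = merge(blockers)
    out = []
    for start, end in merge(regions):
        pos = start
        for b_start, b_end in blockers:
            if b_end < pos or b_start > end:
                continue  # blocker lies outside the remaining region
            if b_start > pos:
                out.append((pos, b_start - 1))
            pos = max(pos, b_end + 1)
        if pos <= end:
            out.append((pos, end))
    return out


def saf_rows(label, chrom, intervals):
    """Format intervals as SAF lines: GeneID, Chr, Start, End, Strand."""
    return [
        f"{label}_{chrom}_{i}\t{chrom}\t{start}\t{end}\t+"  # '+' is a placeholder
        for i, (start, end) in enumerate(intervals, start=1)
    ]


# Hypothetical toy annotation on one chromosome of length 1000.
chrom, chrom_len = "chr1", 1000
genes = [(100, 500), (700, 900)]
exons = [(100, 200), (400, 500), (700, 750), (850, 900)]

intergenic = subtract([(1, chrom_len)], genes)  # genome minus genes
introns = subtract(genes, exons)                # genes minus exons

rows = ["GeneID\tChr\tStart\tEnd\tStrand"]
rows += saf_rows("intergenic", chrom, intergenic)
rows += saf_rows("intron", chrom, introns)
print("\n".join(rows))
```

The printed table can be saved to a file and passed to featureCounts as an SAF annotation. Because strand is ignored here, genes overlapping on opposite strands are treated as one region.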


You can also use the Qualimap rnaseq tool to get the number/percentage of reads mapping to exonic, intronic or intergenic regions: http://qualimap.conesalab.org/doc_html/analysis.html#rna-seq-qc.

I believe you only need the BAM and the GTF files (if I remember correctly). Although you have a GFF file, you can convert it to GTF with gffread: https://github.com/gpertea/gffread
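
As a rough sketch of those two steps, the snippet below builds the gffread and Qualimap command lines and prints them. The file and directory names are placeholders, and it assumes both tools are on the PATH; the options used (`-T` and `-o` for gffread, `-bam`, `-gtf` and `-outdir` for `qualimap rnaseq`) are the basic ones, so check each tool's help output for your installed version.

```python
import shlex


def gffread_cmd(gff, gtf):
    """Command to convert a GFF annotation to GTF (-T) written to `gtf`."""
    return ["gffread", gff, "-T", "-o", gtf]


def qualimap_cmd(bam, gtf, outdir):
    """Command to run Qualimap RNA-seq QC on a BAM with a GTF annotation."""
    return ["qualimap", "rnaseq", "-bam", bam, "-gtf", gtf, "-outdir", outdir]


# Placeholder file names; replace with your own paths.
steps = [
    gffread_cmd("annotation.gff", "annotation.gtf"),
    qualimap_cmd("reads.bam", "annotation.gtf", "qualimap_out"),
]

for cmd in steps:
    print(shlex.join(cmd))
    # To actually run a step:
    # import subprocess; subprocess.run(cmd, check=True)
```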


Thanks to you both !

Qualimap was an excellent suggestion. Exactly what I was looking for. The GFF to GTF conversion should be no problem either.

I found it puzzling that with ultra-long reads one would get so many unassigned counts. Thank you.

