Hello,
I have a question regarding read counts. After sequencing, I found that I have 513k reads that passed. But when I align them (minimap2) and check how many reads I have in my BAM file using Samtools, I find 1.7M reads. When I then look at the mapped reads only, I have 280k. Then, when I use featureCounts to quantify and count my reads, it reports 1.7M reads but assigns only 268k. Can you help me, please?
Another thing: I was also wondering how to quantify rRNAs in my sequencing data using featureCounts. I tried to retrieve all rRNA features from the count file generated by featureCounts, but the number of reads aligning to the rRNAs is greater than the number of reads assigned by featureCounts.
Thank you very much for your help!!
You likely have secondary/supplementary alignments, since you have long-read data. A single read can then appear as several records in the BAM file, which is the likely reason for the discrepancy in the numbers.
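To illustrate the point, here is a minimal Python sketch that classifies SAM records by their FLAG field and compares the number of alignment records with the number of distinct read names. The flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM specification; the records in EXAMPLE_SAM are made up for illustration and are not taken from your data.

```python
from collections import Counter

# FLAG bits defined in the SAM specification
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify(flag: int) -> str:
    """Return the alignment category encoded in a SAM FLAG value."""
    if flag & UNMAPPED:
        return "unmapped"
    if flag & SECONDARY:
        return "secondary"
    if flag & SUPPLEMENTARY:
        return "supplementary"
    return "primary"


def summarize(sam_lines):
    """Count records per category and the number of distinct read names."""
    counts = Counter()
    read_names = set()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@')
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        counts[classify(flag)] += 1
        read_names.add(name)
    return counts, len(read_names)


# Made-up records: read1 has a primary, a secondary and a supplementary
# alignment, read2 has one primary alignment, read3 is unmapped.
EXAMPLE_SAM = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "read1\t256\tchr1\t500\t0\t50M\t*\t0\t0\t*\t*",
    "read1\t2048\tchr1\t700\t60\t20M\t*\t0\t0\t*\t*",
    "read2\t16\tchr1\t300\t60\t50M\t*\t0\t0\t*\t*",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

counts, n_reads = summarize(EXAMPLE_SAM)
print(dict(counts), "distinct reads:", n_reads)
```

Here five records come from only three reads. On a real BAM file, `samtools flagstat` reports the same categories, and `samtools view -c -F 0x904 aln.bam` (aln.bam being a placeholder name) should count only primary mapped alignments, which is the number to compare against the 513k input reads.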
Can you clarify what you are mapping against? Whole genome?
Yes, the whole genome.
Are you sure your genome has the rDNA repeat in it? This is one of the parts of the genome that is not completely placed, since individuals have between 300-500 copies on different chromosomes.
While it is not recommended to use a reduced reference, if you want to identify reads that map to the human rDNA repeat you could use this sequence: How can I download human ribosomal reference?
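If you do align against such a sequence, one way to count the reads that land on it is sketched below in Python: collect the distinct read names whose primary alignment is on the rDNA contig. The contig name "rDNA_unit" and the example records are hypothetical; substitute whatever name the sequence has in your reference FASTA.

```python
# Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records
NON_PRIMARY_OR_UNMAPPED = 0x4 | 0x100 | 0x800


def reads_on_contig(sam_lines, contig):
    """Return names of reads whose primary alignment is on `contig`."""
    hits = set()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@')
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & NON_PRIMARY_OR_UNMAPPED:
            continue
        if rname == contig:
            hits.add(name)
    return hits


# Made-up records; "rDNA_unit" is a hypothetical contig name.
EXAMPLE_SAM = [
    "@SQ\tSN:rDNA_unit\tLN:45000",
    "readA\t0\trDNA_unit\t100\t60\t50M\t*\t0\t0\t*\t*",
    "readA\t2048\trDNA_unit\t9000\t60\t20M\t*\t0\t0\t*\t*",
    "readB\t0\tchr1\t500\t60\t50M\t*\t0\t0\t*\t*",
    "readB\t256\trDNA_unit\t300\t0\t50M\t*\t0\t0\t*\t*",
    "readC\t16\trDNA_unit\t700\t60\t50M\t*\t0\t0\t*\t*",
]

rdna_reads = reads_on_contig(EXAMPLE_SAM, "rDNA_unit")
print(len(rdna_reads), sorted(rdna_reads))
```

Counting by read name rather than by record means each read contributes once, even when it also has supplementary or secondary alignments on the repeat.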
Any reason you are not using pipelines like this one from ONT?