Hi all,
I'm processing two RNA-seq datasets. One has a fragment insert size of less than 30 bp, the other less than 500 bp. The mapping results from STAR look good: in both, more than 95% of reads have a MAPQ value > 10. What confuses me is that when I used featureCounts to count reads against the mouse mm10 GTF file, gencode.vM15.annotation.gtf, only a small fraction of reads were successfully assigned. The 30 bp library had an assignment rate of only 27%, and the 500 bp library only 29%. So many reads were lost! I don't know whether this is normal, or whether I have done something wrong. My command line is:
featureCounts -T 20 -t exon -g gene_name -a gencode.vM15.annotation.gtf -o counts.txt sample.bam
Could anyone give some suggestions? Thank you so much.
Not sure about others, but MAPQ > 10 is not a great cut-off. If STAR's interpretation of MAPQ is what MAPQ is supposed to be (Phred-scaled), then 10 equates to a 1-in-10 chance of misalignment.
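To make the Phred arithmetic concrete, here is a minimal sketch of the conversion, assuming the MAPQ value really is Phred-scaled (Q = -10 * log10(p)). The function name is made up for illustration, and STAR's manual describes a small set of fixed MAPQ values for unique and multi-mapping reads, so this may not apply directly to STAR output.

```python
def mapq_to_error_probability(mapq: int) -> float:
    """Convert a Phred-scaled MAPQ into the probability that the
    reported mapping position is wrong: p = 10 ** (-Q / 10).

    Assumption: the aligner reports a true Phred-scaled value.
    """
    return 10 ** (-mapq / 10)


# Compare a few common cut-offs.
for q in (10, 20, 30):
    print(f"MAPQ {q}: error probability ~ {mapq_to_error_probability(q):.3g}")
```

Under that assumption, moving the cut-off from 10 to 20 lowers the nominal chance of misalignment from about 1 in 10 to about 1 in 100.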
What were the other STAR statistics? You can paste the entire report here. Uniquely aligned reads would be interesting.
Crunch question: did you also align to mm10?
Check the % of aligned reads with MAPQ>20.
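One way to run that check is sketched below, under the assumption that the alignments are available as SAM text lines (for example, decoded from the BAM with a tool such as samtools). The function name and the demo records are hypothetical, not taken from the poster's data. It relies only on the standard SAM layout: FLAG in column 2, MAPQ in column 5, and FLAG bit 0x4 marking unmapped reads.

```python
def fraction_above_mapq(sam_lines, threshold=20):
    """Return the fraction of aligned SAM records with MAPQ > threshold.

    Header lines (starting with '@') and unmapped records (FLAG bit 0x4)
    are ignored. Secondary alignments are counted as separate records.
    """
    aligned = 0
    passing = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete alignment record
        if int(fields[1]) & 0x4:
            continue  # read is unmapped
        aligned += 1
        if int(fields[4]) > threshold:
            passing += 1
    return passing / aligned if aligned else 0.0


# Made-up records for illustration only.
demo_lines = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t100\t255\t30M\t*\t0\t0\t*\t*",
    "r2\t256\tchr2\t200\t3\t30M\t*\t0\t0\t*\t*",
    "r3\t16\tchr3\t300\t0\t30M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(f"{100 * fraction_above_mapq(demo_lines):.1f}% of aligned records have MAPQ > 20")
```

Because the function accepts any iterable of lines, it could also be fed an open SAM file or a text stream instead of the demo list.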