Question

featureCounts generates low assigned rate

7.5 years ago

yuabrahamliu ▴ 60

Hi all,

I'm processing two RNA-seq datasets. One has a fragment insert size of less than 30 bp, the other less than 500 bp. The STAR mapping results look good: in both, more than 95% of reads have a MAPQ value > 10. What confuses me is that when I used featureCounts to count the reads against the mouse mm10 GTF file, gencode.vM15.annotation.gtf, only a small fraction of the reads were successfully assigned. The 30 bp library had an assigned rate of only 27%, and the 500 bp library a rate of 29%. So many reads were lost! Is this normal, or have I done something wrong? My command line is:

featureCounts -T 20 -t exon -g gene_name -a gencode.vM15.annotation.gtf -o counts.txt sample.bam

Could anyone give some suggestions? Thank you so much.

RNA-Seq • 3.2k views

updated 7.5 years ago by swbarnes2 15k • written 7.5 years ago by yuabrahamliu ▴ 60

Not sure about others, but MAPQ > 10 is not a great cut-off. If STAR's MAPQ is Phred-scaled, as MAPQ is supposed to be, then 10 equates to a 1-in-10 chance of misalignment.
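For reference, the Phred relationship mentioned here is P(wrong) = 10^(-MAPQ/10). A minimal sketch of that arithmetic follows; the function name `mapq_to_error_prob` is made up for illustration, and the conversion only means something if the aligner's MAPQ really is Phred-scaled.

```shell
# Hypothetical helper: convert a Phred-scaled MAPQ into the implied
# probability that the alignment is wrong, P = 10^(-Q/10).
# Only meaningful if the aligner's MAPQ is actually Phred-scaled.
mapq_to_error_prob() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", 10 ^ (-q / 10) }'
}

mapq_to_error_prob 10   # 1-in-10 chance of misalignment
mapq_to_error_prob 20   # 1-in-100 chance of misalignment
```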

What were the other STAR statistics? You can paste the entire report here. Uniquely aligned reads would be interesting.

Crunch question: did you also align to mm10?

7.5 years ago by Kevin Blighe 89k

Check the % of aligned reads with MAPQ>20.
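One way to run this check is to stream the alignments as SAM text and count them with awk. This is only a sketch: `mapq_percent_above` is a hypothetical helper name, and it counts alignment records (secondary alignments included), not unique reads. Note that STAR by default reports MAPQ 255 for uniquely mapped reads, so with STAR output this mostly separates unique mappers from multi-mappers.

```shell
# Hypothetical helper: percentage of mapped SAM records whose MAPQ is
# strictly above a threshold. Reads SAM text on stdin.
mapq_percent_above() {
    awk -F'\t' -v min="$1" '
        /^@/ { next }                  # skip SAM header lines
        int($2 / 4) % 2 == 1 { next }  # skip unmapped records (FLAG bit 0x4)
        { total++; if ($5 > min) pass++ }
        END {
            if (total) printf "%.1f\n", 100 * pass / total
            else print "NA"
        }
    '
}

# Example usage (assumes samtools is installed; sample.bam is the BAM
# from the question):
#   samtools view sample.bam | mapq_percent_above 20
```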

7.5 years ago by Arup Ghosh 3.4k

score 0 · Answer 1 · 2018-01-30

7.5 years ago

swbarnes2 15k

The simple, but annoying, advice is to go back to where you got your genome and your GTF, and make sure they came from the same place. If they didn't, get fresh ones from, say, Ensembl, and redo the alignment.
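A quick way to test for a genome/annotation mismatch before redoing anything is to compare the sequence names in the two files. GENCODE annotations name chromosomes like chr1, while Ensembl uses 1, so mixing sources can leave reads with no matching features. The sketch below assumes an uncompressed genome FASTA and GTF; the function name `gtf_contigs_missing_from_fasta` and the file name genome.fa are made up for illustration.

```shell
# Hypothetical helper: print sequence names that appear in the GTF but
# not in the genome FASTA. Empty output means every annotated contig
# exists in the genome.
gtf_contigs_missing_from_fasta() {
    gcm_gtf="$1"
    gcm_fasta="$2"
    gcm_tmp=$(mktemp -d) || return 1
    # Contig names in the GTF: column 1 of non-comment, non-empty lines.
    awk -F'\t' '!/^#/ && NF { print $1 }' "$gcm_gtf" | sort -u > "$gcm_tmp/gtf.txt"
    # Contig names in the FASTA: first word of each header line, minus '>'.
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$gcm_fasta" | sort -u > "$gcm_tmp/fasta.txt"
    # comm -23 keeps only the lines unique to the first (GTF) list.
    comm -23 "$gcm_tmp/gtf.txt" "$gcm_tmp/fasta.txt"
    rm -rf "$gcm_tmp"
}

# Example usage (genome.fa is an assumed file name):
#   gtf_contigs_missing_from_fasta gencode.vM15.annotation.gtf genome.fa
```

If this prints most or all of the annotation's contigs, the naming schemes differ and the two files probably came from different sources.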
