Question

FeatureCounts for miRNAseq

5.5 years ago

Muc09 ▴ 20

I've been looking for an answer in similar questions, but I haven't found what I suspect is the problem. I have 9 files of small RNA-seq reads from human. I aligned them with Bowtie2 and got a .sam file for each sample. Now I am counting with featureCounts, and the results show 0% assigned reads for all the files. For the alignment I used the Ensembl human genome (GRCh38). My .gff3 file is from miRBase v22 (hsa.gff3). I ran featureCounts with the following command:

featureCounts -t miRNA -g 'Name' -a /path/to/hsa.gff3 -o /path/to/counts.txt /path_to_all/*.sam

Similar output for all 9 samples:
|| Process SAM file sample_name.sam...
|| Single-end reads are included.
|| Total alignments : 58350348
|| Successfully assigned alignments : 26377 (0.0%)
|| Running time : 4.79 minutes
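
One quick check on the annotation side is to confirm that the GFF3 really contains features of the type given to `-t` and that those features carry the attribute given to `-g`. Below is a minimal Python sketch, assuming the standard tab-separated GFF3 layout (feature type in column 3, `key=value` attributes in column 9). The function name `summarize_gff3` and the example rows are made up for illustration and are not taken from the real hsa.gff3.

```python
from collections import Counter


def summarize_gff3(lines, feature_type="miRNA", attribute="Name"):
    """Return (feature type counts, number of `feature_type` rows that have `attribute`)."""
    type_counts = Counter()
    with_attribute = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 feature row
        type_counts[fields[2]] += 1
        if fields[2] != feature_type:
            continue
        # Column 9 holds semicolon-separated key=value pairs.
        attrs = dict(
            pair.split("=", 1) for pair in fields[8].split(";") if "=" in pair
        )
        if attribute in attrs:
            with_attribute += 1
    return type_counts, with_attribute


# Hypothetical rows in miRBase-like style (coordinates and IDs are invented).
example = [
    "##gff-version 3",
    "chr1\t.\tmiRNA_primary_transcript\t100\t180\t.\t+\t.\tID=MI_X;Name=hsa-mir-x",
    "chr1\t.\tmiRNA\t110\t131\t.\t+\t.\tID=MIMAT_X;Name=hsa-miR-x-5p;Derives_from=MI_X",
]
type_counts, named = summarize_gff3(example)
print(dict(type_counts), named)
```

If the count for the requested type is zero, or none of those rows carry the requested attribute, the `-t`/`-g` pair cannot assign anything regardless of how the reads aligned.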

RNA-Seq software error next-gen • 3.5k views

ADD COMMENT • link 5.5 years ago by Muc09 ▴ 20

I'm guessing your question is: why do you have ~58M alignments but only ~26k assigned to features?

The answer is probably your gff file combined with your sequencing strategy. Try using a more "generic" gff file to see what's been sequenced/aligned.

Another issue I see is that you're using Bowtie2, which is not splice-aware. I'm not sure about small RNAs, but you're better off using STAR or another splice-aware aligner.

ADD REPLY • link 5.5 years ago by Mark ★ 1.6k

Actually, for microRNAs you want to align without gaps, since you are looking at small reads, typically 20-30 bp. So bowtie v1.x would be a better choice of aligner.

ADD REPLY • link 5.5 years ago by GenoMax 151k

Thanks for the correction. What's the difference between micro and small RNA? Or are they different terms for the same thing?

ADD REPLY • link 5.5 years ago by Mark ★ 1.6k

Small RNAs are a superset covering all short RNAs (<200 nt), whereas miRNAs are much smaller (~22 nt) and thus require ungapped alignment to detect.

ADD REPLY • link 5.5 years ago by GenoMax 151k

Ok, thanks for the advice. I'll try using Homo_sapiens.GRCh38.98.gtf to check in more detail. Also, the main reason for choosing Bowtie2 came from this publication: doi/10.1261/rna.055509.115. I also tried different tools and aligners and got better results with Bowtie2, as the publication suggests.
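
To see which RNA classes a generic annotation can capture at all, one option is to tally its genes by biotype. This is a rough Python sketch, assuming an Ensembl-style GTF in which `gene` rows carry a `gene_biotype "..."` attribute in column 9. The function name `count_gene_biotypes` and the example rows are hypothetical.

```python
import re
from collections import Counter

# Ensembl-style GTF attributes look like: key "value"; key "value";
BIOTYPE_RE = re.compile(r'gene_biotype "([^"]+)"')


def count_gene_biotypes(lines):
    """Count `gene` rows per gene_biotype in a GTF given as an iterable of lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "gene":
            continue  # keep one row per gene, skip transcripts/exons
        match = BIOTYPE_RE.search(fields[8])
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Invented rows for illustration only.
example = [
    '#!genome-build GRCh38',
    '1\tensembl\tgene\t100\t200\t.\t+\t.\tgene_id "G1"; gene_biotype "snoRNA";',
    '1\tensembl\tgene\t300\t380\t.\t-\t.\tgene_id "G2"; gene_biotype "miRNA";',
    '1\tensembl\texon\t300\t380\t.\t-\t.\tgene_id "G2"; gene_biotype "miRNA";',
]
print(dict(count_gene_biotypes(example)))
```

On a real file, the lines would come from opening the GTF as text, for example `with open(path) as handle: count_gene_biotypes(handle)`.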

ADD REPLY • link 5.5 years ago by Muc09 ▴ 20

Couple of things to check in this situation:

What % of reads are uniquely mapped? Remember that, by default, featureCounts doesn't count secondary mappings or multimappings. Is your sequencing paired-end or single-end? Bowtie has issues with paired-end data when the two ends overlap too much, which would almost always be the case with miRNA-seq.
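
A rough way to get at the mapping breakdown is to classify SAM records by their FLAG field. The sketch below is plain Python working on SAM text lines. It relies on the SAM specification's flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) and on the optional `NH:i:` tag, which only some aligners write, so a missing `NH` tag does not prove a read is uniquely mapped. The function name `summarize_sam` and the example records are made up.

```python
from collections import Counter

# Flag bits defined by the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def summarize_sam(lines):
    """Classify SAM alignment records as unmapped, secondary, supplementary or primary."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        elif flag & SECONDARY:
            counts["secondary"] += 1
        elif flag & SUPPLEMENTARY:
            counts["supplementary"] += 1
        else:
            counts["primary"] += 1
            # NH:i:<n> is optional; when present, n > 1 marks a multi-mapped read.
            nh = next(
                (int(tag[5:]) for tag in fields[11:] if tag.startswith("NH:i:")),
                None,
            )
            if nh is not None and nh > 1:
                counts["primary_multimapped_NH"] += 1
    return counts


# Invented records for illustration only.
example = [
    "@HD\tVN:1.6",
    "r1\t0\t1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t256\t1\t500\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t16\t2\t900\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:3",
]
print(dict(summarize_sam(example)))
```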

Finally, it's possible that your reads are running over the end of the miRNA annotation, in which case featureCounts will ignore them. I think there is a setting to tell it not to do this.
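
To make the overhang idea concrete, here is a small Python sketch that computes how many bases a read shares with a feature and what fraction of the read that represents. It assumes 1-based, inclusive coordinates as used in GFF3; the function names and the numbers are illustrative. featureCounts has overlap options such as `--minOverlap` and `--fracOverlap`; which thresholds your version applies by default is worth checking in its documentation.

```python
def overlap_bases(read_start, read_end, feat_start, feat_end):
    """Number of bases shared by two 1-based, inclusive intervals (0 if disjoint)."""
    return max(0, min(read_end, feat_end) - max(read_start, feat_start) + 1)


def overlap_fraction(read_start, read_end, feat_start, feat_end):
    """Fraction of the read's bases that fall inside the feature."""
    read_length = read_end - read_start + 1
    return overlap_bases(read_start, read_end, feat_start, feat_end) / read_length


# Hypothetical case: a 26 nt read starting 5 nt upstream of a 22 nt mature miRNA.
shared = overlap_bases(105, 130, 110, 131)
fraction = overlap_fraction(105, 130, 110, 131)
print(shared, round(fraction, 3))
```

In this made-up case the read hangs over the start of the feature by 5 bases, so a strict requirement that the whole read lie inside the feature would reject it, while a one-base minimum overlap would accept it.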

ADD REPLY • link 5.5 years ago by i.sudbery 21k

My library is single-end. I already tried the advice from Amar ("generic gff file"), and I've now figured out that many of my reads come from snoRNA, so that's the main reason for the low % when I use hsa.gff3. Anyway, thank you for your answers.

ADD REPLY • link 5.5 years ago by Muc09 ▴ 20

It's completely normal for a large fraction of your reads to be snoRNA, snRNA or other categories of small RNA, but I still wouldn't expect the amount mapping to miRNA to be that small.

Did you trim the reads before mapping? What was the post-trimming size distribution (as measured by FastQC)?
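
If a FastQC report is not at hand, the length distribution can also be tallied directly from the trimmed FASTQ. This is a minimal Python sketch, assuming plain four-line FASTQ records with no wrapped sequence lines; the function name `read_length_distribution` and the example reads are made up.

```python
from collections import Counter


def read_length_distribution(fastq_lines):
    """Tally sequence lengths from 4-line FASTQ records."""
    lengths = Counter()
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # the second line of each record is the sequence
            lengths[len(line.strip())] += 1
    return lengths


# Invented reads: two of 22 nt and one of 30 nt.
example = [
    "@read1", "A" * 22, "+", "I" * 22,
    "@read2", "C" * 22, "+", "I" * 22,
    "@read3", "G" * 30, "+", "I" * 30,
]
for length, count in sorted(read_length_distribution(example).items()):
    print(length, count)
```

For a well-trimmed miRNA library one would expect a clear peak around 22 nt; a distribution dominated by full-length reads would suggest that adapters were not removed before mapping.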

ADD REPLY • link 5.5 years ago by i.sudbery 21k