rRNA Removal in RNA-Seq Data
11.6 years ago

Hi,

I want to know how many reads in my data come from rRNA. My libraries were prepared with Illumina TruSeq (Stranded) and RiboZero. Here is my idea.

In UCSC Tables:

1. Select "All Tables" from the group drop-down list.
2. Select the "rmsk" table from the table drop-down list.
3. Choose "GTF" as the output format.
4. Type a filename in "output file" so your browser downloads the result.
5. Click "create" next to filter.
6. Next to "repClass," type rRNA.
7. Next to free-form query, select "OR" and type repClass = "tRNA".
8. Click submit on that page, then get output on the main page.

Now I have a GTF file with the rRNA and tRNA annotations.

After that, I use htseq-count to extract the number of reads per rRNA and tRNA gene.
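To turn the htseq-count output into a single number, something like the following could work. It is only a sketch: it assumes the usual two-column, tab-separated htseq-count output, where the special counters at the end start with `__` (for example `__no_feature` and `__not_aligned`). The function name `rrna_fraction` and the feature names in the example are made up for illustration, and the total is only approximate because multi-mapped reads are handled differently across htseq-count versions and options.

```python
import csv
import io


def rrna_fraction(htseq_text):
    """Summarise htseq-count output produced against an rRNA/tRNA GTF.

    Returns (feature_reads, total_reads, fraction), where feature_reads is
    the sum over all annotated features and total_reads also includes the
    special "__" counters that htseq-count appends at the end.
    """
    feature_reads = 0
    special_reads = 0
    for row in csv.reader(io.StringIO(htseq_text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        name, count = row[0], int(row[1])
        if name.startswith("__"):
            special_reads += count  # e.g. __no_feature, __not_aligned
        else:
            feature_reads += count  # reads assigned to an rRNA/tRNA feature
    total_reads = feature_reads + special_reads
    fraction = feature_reads / total_reads if total_reads else 0.0
    return feature_reads, total_reads, fraction


# Hypothetical counts, only to show the expected input shape.
example = (
    "rRNA_1\t120\n"
    "tRNA_1\t30\n"
    "__no_feature\t800\n"
    "__ambiguous\t10\n"
    "__too_low_aQual\t0\n"
    "__not_aligned\t40\n"
    "__alignment_not_unique\t0\n"
)
assigned, total, frac = rrna_fraction(example)
print(f"{assigned} of {total} reads ({frac:.1%}) assigned to rRNA/tRNA features")
```

Reads that the aligner discarded for multi-mapping never reach htseq-count, so this fraction is probably a lower bound on the true rRNA content.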

Is that OK?

Thanks,

N.


Looks OK to me without checking the details.

11.6 years ago

You can also download RepBase (a database of repetitive elements) and Rfam (a database of RNA families, http://rfam.sanger.ac.uk/) and use them as a filter database.

11.6 years ago
Ryan Dale 5.0k

Your post-alignment filtering strategy should work. Another strategy is to do a separate alignment of the unaligned reads to rRNA sequences. Because rRNA genes tend to be duplicated in eukaryotes, highly multi-mapping reads may be discarded (depending on the aligner and the parameters you use), so those reads never make it into the final alignments you would use with htseq-count.
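As a rough illustration of the first step of that second strategy, the sketch below pulls the unaligned reads out of SAM text and writes them as FASTQ, ready to be aligned against an rRNA reference with whatever aligner you prefer. It assumes single-end reads and that the aligner kept unaligned reads in its output (SAM flag 0x4); the function name `unaligned_to_fastq` is hypothetical, and for paired-end data you would also need to keep track of the mates.

```python
# Translation table for reverse-complementing reads stored on the reverse strand.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def unaligned_to_fastq(sam_lines):
    """Return FASTQ text for every unaligned read (flag 0x4) in SAM lines."""
    records = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if not flag & 0x4:
            continue  # read is aligned, leave it alone
        if flag & 0x10:
            # SEQ is stored reverse-complemented when 0x10 is set;
            # restore the read as it was sequenced.
            seq = seq.translate(COMPLEMENT)[::-1]
            qual = qual[::-1]
        records.append(f"@{name}\n{seq}\n+\n{qual}\n")
    return "".join(records)


# Hypothetical SAM lines: r1 is unaligned, r2 is aligned.
sam = [
    "@HD\tVN:1.0",
    "r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
]
print(unaligned_to_fastq(sam), end="")
```

In practice a tool such as samtools can do the same extraction directly from a BAM file; the point here is only to show which reads go into the second alignment.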

Also, if your goal is to remove rRNA reads from downstream analysis, and you use the post-alignment filtering strategy, you may want to go back and remove other alignments of that read that multi-mapped to non-rRNA regions.
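A minimal sketch of that clean-up, working on plain SAM text: it first collects the names of reads with at least one alignment overlapping an rRNA interval, then drops every alignment carrying one of those names, including alignments elsewhere in the genome. The names `drop_rrna_reads`, `ref_end` and `overlaps_rrna` are hypothetical, intervals are assumed to be 1-based and inclusive as in GTF, and the linear scan over intervals is only sensible for small annotation sets.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def ref_end(pos, cigar):
    """1-based inclusive reference end of an alignment starting at pos."""
    length = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos + length - 1


def overlaps_rrna(chrom, start, end, rrna_intervals):
    """True if [start, end] on chrom overlaps any (chrom, start, end) interval."""
    return any(c == chrom and start <= e and end >= s for c, s, e in rrna_intervals)


def drop_rrna_reads(sam_lines, rrna_intervals):
    """Drop all alignments of any read that has an alignment overlapping rRNA."""
    sam_lines = list(sam_lines)
    flagged = set()
    # Pass 1: find read names with at least one rRNA-overlapping alignment.
    for line in sam_lines:
        if line.startswith("@"):
            continue
        f = line.rstrip("\n").split("\t")
        name, flag, chrom, pos, cigar = f[0], int(f[1]), f[2], int(f[3]), f[5]
        if flag & 0x4 or cigar == "*":
            continue  # unaligned record, nothing to test
        if overlaps_rrna(chrom, pos, ref_end(pos, cigar), rrna_intervals):
            flagged.add(name)
    # Pass 2: keep header lines and alignments of reads that were never flagged.
    return [
        line for line in sam_lines
        if line.startswith("@") or line.split("\t")[0] not in flagged
    ]


# Hypothetical data: r1 hits an rRNA locus and also multi-maps to chr2.
rrna = [("chr1", 120, 200)]
sam = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*",
    "r1\t256\tchr2\t500\t255\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr2\t500\t255\t50M\t*\t0\t0\t*\t*",
]
for line in drop_rrna_reads(sam, rrna):
    print(line)
```

Because filtering is by read name, both mates of a pair are removed when either one overlaps rRNA, which may or may not be what you want.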

I don't have a feel for how different these strategies are though -- it's possible they get you roughly the same answer in the end.
