Ribosomal Intervals For CollectRnaSeqMetrics
11.7 years ago
Johan ▴ 890

Does anyone know of a publicly available ribosomal interval list for use with CollectRnaSeqMetrics in the Picard suite? Specifically, I'm looking for an interval list for the human genome in build g1k_v37. I found a description of how to generate my own in the FAQ, but for obvious reasons it would be nice if there were one ready to use somewhere out there. Googling hasn't turned anything up for me so far, so I thought I'd turn to the community to ask.

rna-seq qc picard • 13k views

Is there one for hg38?

11.6 years ago
Dan D 7.4k

Hi Johan,

I was researching this exact question today and came across this post by Alec Wysoker, the author of Picard tools:

http://sourceforge.net/mailarchive/message.php?msg_id=27560147

I would just post his list here but it exceeds the character limit. I had to take out the @SQ lines referencing "GL" locations (the lines after 26), but other than that it worked great for me.

EDIT: You need to make sure your chromosome names in the @SQ lines match the first column in your interval list. I can give more detail if you don't already know what I'm talking about.
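To illustrate the point about matching names, here is a minimal Python sketch that reports contigs used in the interval rows but never declared in an @SQ line. The function name `check_interval_list` and the example rows are made up for illustration; it only assumes the usual interval list layout (SAM-style header lines starting with `@`, then tab-separated rows whose first column is the contig name).

```python
import io


def check_interval_list(handle):
    """Return contig names used in interval rows but absent from the @SQ header lines."""
    declared = set()
    missing = set()
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            fields = line.split("\t")
            if fields[0] == "@SQ":
                # Collect the sequence name from the SN: tag of each @SQ line.
                for field in fields[1:]:
                    if field.startswith("SN:"):
                        declared.add(field[3:])
        else:
            # First column of an interval row is the contig name.
            contig = line.split("\t")[0]
            if contig not in declared:
                missing.add(contig)
    return sorted(missing)


# Illustrative data only: the header declares "1" but the interval row uses "chr1".
example = io.StringIO(
    "@HD\tVN:1.4\n"
    "@SQ\tSN:1\tLN:249250621\n"
    "chr1\t100\t200\t+\trRNA_example\n"
)
print(check_interval_list(example))
```

An empty list means every interval row refers to a contig declared in the header; anything else points to a naming mismatch such as `1` versus `chr1`.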

EDIT 2: Also ran across this site while searching for mouse:

http://dldcc-web.brc.bcm.edu/lilab/liguow/CGI/rseqc/_build/html/index.html#download-ribosome-rna-update-to-08-17-2012


I don't know whether that list is complete if you're using all the contigs. We get a lot of rRNA reads aligning to contig GL000220.1.

10.0 years ago
Kamil ★ 2.3k

You can see my ribosomal intervals file and a simple script I used to create it here:


Thank you for sharing, Kamil. I made a couple of changes to the script so that it works with other species and the rRNA annotation can be in BED format:

Of course, you have been acknowledged. I hope you don't mind :)


Using this make_rRNA.sh script, I was able to generate the interval list very easily. However, the generated interval list is missing its header lines (for example, the lines starting with @SQ). Any pointers as to what the issue might be?


imaparna27, it seems that there is something wrong with this line of your script:

perl -lane 'print "\@SQ\tSN:$F[0]\tLN:$F[1]\tAS:$ENV{'genome'}"' $chrom_sizes | \
grep -v _ \
>> $rRNA
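For anyone who wants to cross-check the header, here is a rough Python sketch of what that one-liner appears intended to do: turn a two-column chromosome sizes file (name, length) into @SQ lines and skip contigs whose names contain an underscore. The function name `sq_header_lines` and the assembly label are hypothetical; this is not the code from make_rRNA.sh.

```python
def sq_header_lines(chrom_sizes_lines, assembly):
    """Build @SQ header lines from 'name<whitespace>length' rows."""
    lines = []
    for raw in chrom_sizes_lines:
        fields = raw.split()
        if len(fields) < 2:
            # Skip blank or malformed rows.
            continue
        name, length = fields[0], fields[1]
        if "_" in name:
            # Roughly what `grep -v _` does in the shell version:
            # drop contigs such as chr1_random.
            continue
        lines.append(f"@SQ\tSN:{name}\tLN:{length}\tAS:{assembly}")
    return lines


# Illustrative input only.
for header in sq_header_lines(["chr1\t1000", "chr1_random\t50"], "hg19"):
    print(header)
```

If this produces @SQ lines from your chromosome sizes file but the shell script does not, the likely suspects are the variables the one-liner relies on (`$chrom_sizes`, `$rRNA`, and the `genome` environment variable), though that is a guess without seeing the actual run.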

On the other hand, I have run this for GRCh38.p13, and it's important to use gene_type and not gene_biotype. The big question here:

Is it transcripts or genes that we must select? The two scripts differ on that.

Also, if I select transcripts I get a very low number of transcripts. Is that right?
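A minimal Python sketch of the selection step may help when comparing the two choices. It assumes a standard 9-column GTF and accepts either attribute key, since GENCODE-style GTFs label the biotype as gene_type while Ensembl-style GTFs use gene_biotype. The function name `rrna_intervals` and the `feature` parameter are hypothetical, and this is not the logic of either script mentioned in this thread.

```python
import re

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def rrna_intervals(gtf_lines, feature="gene"):
    """Return interval rows (contig, start, end, strand, name) for rRNA features."""
    rows = []
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature:
            continue
        attrs = dict(ATTR_RE.findall(cols[8]))
        # Accept whichever biotype key the annotation provides.
        biotype = attrs.get("gene_type") or attrs.get("gene_biotype")
        if biotype != "rRNA":
            continue
        name = attrs.get("gene_id", ".")
        # GTF coordinates are 1-based and inclusive, like interval list rows.
        rows.append("\t".join([cols[0], cols[3], cols[4], cols[6], name]))
    return rows


# Illustrative records only, one in each attribute style.
gtf = [
    'chr1\tHAVANA\tgene\t100\t200\t.\t+\t.\tgene_id "G1"; gene_type "rRNA";\n',
    '1\tensembl\tgene\t300\t400\t.\t-\t.\tgene_id "G2"; gene_biotype "rRNA";\n',
]
for row in rrna_intervals(gtf):
    print(row)
```

Calling it once with `feature="gene"` and once with `feature="transcript"` on the same annotation gives a quick way to compare how many intervals each choice yields.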


"it's important to use gene_type and not gene_biotype"

Why'd you say this?
