Annotation For Retrotransposons
12.1 years ago

I’m searching for a downloadable annotation file for different types of ncRNA: in particular, rRNAs and retrotransposons for mouse. Most ncRNAs are not a problem – for instance, Ensembl provides a FASTA download of ncRNAs. Luckily, the actual sequence data is rather short and the file contains full annotation.

Unfortunately, it doesn’t contain retrotransposons. I’m particularly interested in identifying LINEs and SINEs. Where could I get that information? I tried Ensembl Biomart but was unable to find the appropriate filter there. The Ensembl API also contains a promising method, TranscriptAdaptor::fetch_all_by_biotype, but I cannot find a list of biotypes anywhere, and if I read the ncRNA documentation correctly, Ensembl simply doesn’t annotate retrotransposons at all.

Is there any way to obtain that information?

annotation rna ensembl • 4.0k views
12.1 years ago

Answering my own question:

Ensembl Biomart actually allows filtering by Gene type: “retrotransposed”.

However, this only works in the current build of Ensembl and I am unfortunately forced to use an archived build since I’m working on the genome version NCBIM37.

12.1 years ago
nbartonicek ▴ 160

This should help you with fetching the repeats: http://www.ensembl.org/info/docs/api/core/core_tutorial.html#repeats

I use:

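use Bio::EnsEMBL::Registry;
my $registry = 'Bio::EnsEMBL::Registry';
# $ensembl (release number) and $species (e.g. 'mouse') are assumed to be set earlier
$registry->load_registry_from_url("mysql://anonymous\@ensembldb.ensembl.org/$ensembl");
# Adaptor for fetching repeat features from the core database
my $ra_core = $registry->get_adaptor( $species, 'Core', 'RepeatFeature' );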
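
Building on the registry setup above, here is a minimal sketch of how the repeats on a region might be narrowed down to LINEs and SINEs. It assumes that repeat_class on the repeat consensus carries RepeatMasker-style labels such as "LINE/L1" or "SINE/B2". The helper names is_line_or_sine and print_lines_and_sines, the command-line arguments, and the example region on chromosome 1 are made up for illustration and are not part of the Ensembl API.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a RepeatMasker-style class label
# (e.g. "LINE/L1", "SINE/B2") denotes a LINE or a SINE.
sub is_line_or_sine {
    my ($repeat_class) = @_;
    return 0 unless defined $repeat_class;
    return $repeat_class =~ m{\A(?:LINE|SINE)(?:/|\z)} ? 1 : 0;
}

# Hypothetical helper: print LINE/SINE repeats overlapping a slice as
# tab-separated lines (region, start, end, strand, name, class).
sub print_lines_and_sines {
    my ($slice) = @_;
    for my $rf ( @{ $slice->get_all_RepeatFeatures() } ) {
        my $consensus = $rf->repeat_consensus;
        next unless is_line_or_sine( $consensus->repeat_class );
        print join( "\t",
            $slice->seq_region_name, $rf->seq_region_start,
            $rf->seq_region_end,     $rf->seq_region_strand,
            $consensus->name,        $consensus->repeat_class ), "\n";
    }
    return;
}

# Only contact the public database when called as:
#   perl script.pl <species> <ensembl_release>
if ( @ARGV == 2 ) {
    my ( $species, $ensembl ) = @ARGV;
    require Bio::EnsEMBL::Registry;
    Bio::EnsEMBL::Registry->load_registry_from_url(
        "mysql://anonymous\@ensembldb.ensembl.org/$ensembl");
    my $slice_adaptor =
      Bio::EnsEMBL::Registry->get_adaptor( $species, 'Core', 'Slice' );
    # Example region only; adjust the coordinates to the region of interest.
    my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '1',
        1_000_000, 1_100_000 );
    print_lines_and_sines($slice);
}
```

Fetching slice by slice keeps memory use manageable. For a whole-genome annotation you would loop over all top-level slices, ideally in chunks.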

Are LINEs and SINEs annotated in repeats though? Anyway, I’m gonna try this, thanks.


The repeat track in Ensembl contains all repeats reported by RepeatMasker, so it includes LINEs and SINEs.

12.1 years ago
Ashwin ▴ 110

You should probably check Dfam.


That database only contains repeats from the human genome, for now anyway. It appears the OP is interested in repeats from mouse.

12.1 years ago
nbartonicek ▴ 160

The ribosomal RNA is a problem. See the discussion here: "Where To Find Detailed Information On Genomic Location Of Rrna Genes?"

12.1 years ago
SES 8.6k

You can download all the repeats for a specific taxon, mouse for example, from Repbase. Just follow the "Browse RU" link, select the taxon you want, and download the file of repeats. It's free, but you will have to create an account.
