Ensembl Perl API
0
1
8.2 years ago
baxy ▴ 170

Hi,

Does anyone know whether it is possible to get information for plants and fungi from Ensembl via the Perl API? So far I've only worked with "popular" species (human, mouse, fruit fly), and their data can easily be obtained from ensembldb.ensembl.org. However, the core data for fungi and plants does not appear to be located there (right?). Here is a quick example of what I am doing:

use Bio::EnsEMBL::Registry;

# Load every database available on the public Ensembl server
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
   -host => 'ensembldb.ensembl.org',
   -user => 'anonymous'
);
my $slice_adaptor = $registry->get_adaptor( 'Mouse', 'Core', 'Slice' );
my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '1');
# process stuff....

Any advice?

Thanks

Perl Ensembl • 2.7k views
0

I think it should be possible. I have only really used Ensembl for vertebrate genomes, but the fungi section says that the Ensembl Perl API can be used to access Ensembl Fungi data: http://fungi.ensembl.org/tools.html

This page has some more info: http://ensemblgenomes.org/info/access/api

In contrast to Ensembl, Ensembl Genomes provide six different Ensembl Compara databases, for each of the five divisions plus the pan-taxonomic compara. These can be selected from the Registry using the division ("metazoa", "plants", "fungi", "protists", "bacteria"; or "pan_homology" for the pan-taxonomic compara) as the "species" name e.g.

When working with larger numbers of genomes, e.g. Ensembl Bacteria, Fungi and Protists, easier selection of genomes of interest is provided by an auxiliary Ensembl Genomes Perl API: http://ensemblgenomes.org/info/access/eg_api

0

Well, that's the thing. If I try this for any species whose core database is not on ensembldb, like all fungi and plants:

my $slice_adaptor = $registry->get_adaptor( 'Arabidopsis thailana', 'Core', 'Slice' );
my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '1');

I get

Can't call method "fetch_by_region" on an undefined value

This is because I get undef for $slice_adaptor. I tried all sorts of acronyms, alternative names and taxids, but it appears plants and fungi are not on ensembldb.ensembl.org. Also, when I list all the available species, the plants are not there.

So my question is: where are they located?

1

The page: http://ensemblgenomes.org/info/access/api seems to suggest using something like:

my $genome_db_adaptor = Bio::EnsEMBL::Registry->get_adaptor(
    'fungi', 'compara', 'GenomeDB');

But that would presumably return all fungi. Then the page: http://ensemblgenomes.org/info/access/eg_api seems to suggest something like:

use strict;
use warnings;
use Bio::EnsEMBL::LookUp;
my $lookup = Bio::EnsEMBL::LookUp->new();
my @dbas = @{$lookup->get_all_by_taxon_id(4751)};

4751 is the NCBI taxonomy ID for fungi.

0

Yep, it seems plants and fungi are not in the registry, which means my scripts need rewriting. So, for those who have the same problem:

use Bio::EnsEMBL::LookUp;
my $lookup = Bio::EnsEMBL::LookUp->new();
my $ad = $lookup->get_by_name_exact('arabidopsis_thaliana');
my $genes = $ad->get_GeneAdaptor()->fetch_all();
foreach my $gene (@$genes) {
    # do stuff with $gene
}

Much less code, which is a plus, but it requires adapting. If anyone figures out how to extract all repeats by chromosome in Arabidopsis, let me know.

Cheers
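
For the repeats question, one possible approach is sketched below. It reuses the LookUp calls from the snippet above and then relies on the core API's get_SliceAdaptor, fetch_all('chromosome') and get_all_RepeatFeatures. It assumes that the Arabidopsis core database names its top-level coordinate system 'chromosome' and actually stores repeat features; neither point was checked in this thread, so treat it as a starting point rather than a tested solution.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::LookUp;

# Same lookup as in the snippet above.
my $lookup = Bio::EnsEMBL::LookUp->new();
my $dba    = $lookup->get_by_name_exact('arabidopsis_thaliana');
die "arabidopsis_thaliana not found\n" unless defined $dba;

# Assumption: the top-level sequences use the 'chromosome' coordinate system.
my $slices = $dba->get_SliceAdaptor()->fetch_all('chromosome');

foreach my $slice (@$slices) {
    my $chr     = $slice->seq_region_name;
    my $repeats = $slice->get_all_RepeatFeatures();

    # One tab-separated line per repeat: chromosome, start, end, repeat name.
    foreach my $rf (@$repeats) {
        print join( "\t",
            $chr, $rf->start, $rf->end, $rf->repeat_consensus->name ), "\n";
    }
}
```

Because each slice spans a whole chromosome, the start and end values printed here are chromosome coordinates.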

0

I'm not certain this is the only issue, but you spelled "thaliana" wrong. Also, I recommend following best practices and putting 'use strict;' and 'use warnings;' in your script. That will make debugging so much easier.
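
A minimal sketch of what that could look like, with the species name spelled correctly and an explicit check for an undefined adaptor. The host and port (mysql-eg-publicsql.ebi.ac.uk, 4157) are not mentioned in this thread; they are my assumption for the public Ensembl Genomes MySQL server, which is where plants and fungi are served from rather than ensembldb.ensembl.org, so check them against the Ensembl Genomes API page before relying on them.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

# Assumption: public Ensembl Genomes server, not ensembldb.ensembl.org.
$registry->load_registry_from_db(
    -host => 'mysql-eg-publicsql.ebi.ac.uk',
    -port => 4157,
    -user => 'anonymous',
);

my $species       = 'arabidopsis_thaliana';    # note the spelling
my $slice_adaptor = $registry->get_adaptor( $species, 'Core', 'Slice' );

# get_adaptor returns undef when the species is unknown to the registry,
# so fail here with a clear message instead of at the next method call.
die "No Slice adaptor found for '$species'\n"
    unless defined $slice_adaptor;

my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '1' );
die "Chromosome 1 not found for '$species'\n" unless defined $slice;

printf "%s: %d bp\n", $slice->name, $slice->length;
```

With the check in place, a misspelled species name gives a readable error rather than "Can't call method "fetch_by_region" on an undefined value".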
