8.1 years ago
baxy
Hi,
Does anyone know if it is possible to get information for plants and fungi from Ensembl via the Perl API? So far I've been working only with "popular" species (human, mouse, fruit fly), and the info can easily be obtained from ensembldb.ensembl.org. However, the core data for fungi and plants does not appear to be located there (right?). So, just a quick example of what I am doing:
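use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# The registry is used through its class name, not an object instance
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous'
);
# get_adaptor returns undef if the species is not on this server
my $slice_adaptor = $registry->get_adaptor( 'Mouse', 'Core', 'Slice' );
my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '1' );
# process stuff...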
Any advice?
Thanks
I think it should be possible. I have only really used Ensembl for vertebrate genomes, but the fungi section says that the Ensembl Perl API can be used to access Ensembl Fungi data: http://fungi.ensembl.org/tools.html.
This page has some more info: http://ensemblgenomes.org/info/access/api
In contrast to Ensembl, Ensembl Genomes provide six different Ensembl Compara databases, for each of the five divisions plus the pan-taxonomic compara. These can be selected from the Registry using the division ("metazoa", "plants", "fungi", "protists", "bacteria"; or "pan_homology" for the pan-taxonomic compara) as the "species" name e.g.
When working with larger numbers of genomes, e.g. Ensembl Bacteria, Fungi and Protists, easier selection of genomes of interest is provided by an auxiliary Ensembl Genomes Perl API: http://ensemblgenomes.org/info/access/eg_api
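For what it's worth, a minimal sketch of pointing the same registry call at the Ensembl Genomes public MySQL server instead of ensembldb.ensembl.org. The host `mysql-eg-publicsql.ebi.ac.uk` and port `4157` are the connection details I remember for that server, so treat them as assumptions and check the API page above. `fetch_region` is a hypothetical helper name, and `arabidopsis_thaliana` is just an example species.

```perl
use strict;
use warnings;

# Assumed connection details for the public Ensembl Genomes MySQL server;
# verify them against the current Ensembl Genomes documentation.
my %EG_SERVER = (
    -host => 'mysql-eg-publicsql.ebi.ac.uk',
    -port => 4157,
    -user => 'anonymous',
);

# Hypothetical helper: load the registry from the given server and return
# a slice for one sequence region of one species.
sub fetch_region {
    my ( $server, $species, $coord_system, $name ) = @_;

    # Loaded at run time so the script still compiles without the API installed
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db( %{$server} );

    # get_adaptor returns undef when the species is not on this server
    my $slice_adaptor = $registry->get_adaptor( $species, 'Core', 'Slice' )
        or die "No core slice adaptor found for '$species'\n";
    return $slice_adaptor->fetch_by_region( $coord_system, $name );
}

# Only connects when a species is given on the command line, e.g.:
#   perl eg_slice.pl arabidopsis_thaliana 1
if (@ARGV) {
    my ( $species, $chr ) = @ARGV;
    $chr = '1' unless defined $chr;
    my $slice = fetch_region( \%EG_SERVER, $species, 'chromosome', $chr );
    die "No chromosome '$chr' found for '$species'\n" unless defined $slice;
    printf "%s: %d bp\n", $slice->name, $slice->length;
}
```

Checking the adaptor before using it gives a readable message instead of a "Can't call method on an undefined value" crash when the species name is wrong or lives on a different server.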
Well, that's the thing. If I try this for any core database that is not on ensembldb, like any of the fungi or plants:
I get
And this is because I get undef for $slice_adaptor. I tried all sorts of acronyms, alternative names and taxon IDs, but it appears plants and fungi are not on ensembldb.ensembl.org. Also, when I list the available species, the plants are not there.
So my question is: where are they located?
The page: http://ensemblgenomes.org/info/access/api seems to suggest using something like:
But that would presumably return all fungi. Then the page: http://ensemblgenomes.org/info/access/eg_api seems to suggest something like:
4751 is the NCBI taxonomy ID for fungi.
Yep, it seems plants and fungi are not in the registry, which means my scripts need rewriting. So, for those that have the same problem:
Much less code, which is a plus, but it requires adapting. If anyone figures out how to extract all repeats by chromosome in Arabidopsis, let me know.
cheers
I'm not certain this is the only issue, but you spelled "thaliana" wrong. Also, I recommend following best practices and putting 'use strict;' and 'use warnings;' in your script. That will make debugging so much easier.
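To illustrate the point about strict, here is a small self-contained sketch (no Ensembl modules needed). The variable names are made up for the example: `$slice_adapter` is a deliberate typo for `$slice_adaptor`. Note that strict flags misspelled variable names, not misspelled strings, so it would not have caught a wrong species name by itself.

```perl
use strict;
use warnings;

# Compile a deliberately broken snippet in a string eval so the error can be
# inspected without stopping this script. $slice_adapter is a typo for
# $slice_adaptor; both names are illustrative only.
my $ok = eval q{
    use strict;
    my $slice_adaptor = 'placeholder';
    $slice_adapter = 'oops';    # undeclared variable: rejected under strict
    1;
};
my $error = $@;

# Without strict, the typo would silently create a new global variable
print "strict caught the typo: $error" unless $ok;
```

With strict enabled the script refuses to compile at the misspelled name, which points to the exact line rather than leaving an undef to surface later.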