Get Ensembl Gene By Species/Strain For E. Coli
13.4 years ago
Jirapong ▴ 30

According to the data at http://bacteria.ensembl.org/index.html, all E. coli strains are stored in a single database:

escherichia_shigella_collection_core_9_62_5a

I'm using Ensembl API version 62 and can connect without problems. However, when I try to get genes for "e_coli_sakai" and "e_coli_k12", both return the same number of genes (4511).

But if I search on the website - http://bacteria.ensembl.org/Multi/Search/Results?species=all;idx=;q=EBESCG00000001004;genomic_unit=bacteria

I can see that this gene is specific to E. coli K12 only.

My Perl script looks like this:

my $db = new Bio::EnsEMBL::DBSQL::DBAdaptor(-species=> 'E coli K12', -dbname=> 'escherichia_shigella_collection_core_9_62_5a');
my $slice_adaptor = $db->get_SliceAdaptor();
my $chromo = $slice_adaptor->fetch_by_region('chromosome', 'Chromosome'); # the E. coli chromosome is named "Chromosome"
my @genes = @{ $chromo->get_all_Genes() };
print(Dump(@genes));

$db = new Bio::EnsEMBL::DBSQL::DBAdaptor(-species=> 'E coli Sakai', -dbname=> 'escherichia_shigella_collection_core_9_62_5a');
$slice_adaptor = $db->get_SliceAdaptor();
$chromo = $slice_adaptor->fetch_by_region('chromosome', 'Chromosome'); # the E. coli chromosome is named "Chromosome"
@genes = @{ $chromo->get_all_Genes() };
print(Dump(@genes));
13.4 years ago
Andeyatz ▴ 70

The problem is that you have not created a multi-species database adaptor for each strain, which is why both of your adaptors return the same set of genes. There are two ways around this. The first is to use load_registry_from_db, which deals with this for you; the second is to create the DBAdaptor explicitly with the multi-species options.

# Using the registry
use Bio::EnsEMBL::Registry;
Bio::EnsEMBL::Registry->load_registry_from_db(-HOST => 'mysql.ebi.ac.uk', -PORT => 4157, -USER => 'anonymous');
my $ecoli_k12 = Bio::EnsEMBL::Registry->get_DBAdaptor('e_coli_k12', 'core');
my $sakai = Bio::EnsEMBL::Registry->get_DBAdaptor('e_coli_sakai', 'core');

Now for the more manual version. First, look up the species id of each strain in the collection database:

-- SQL to find the species -> species id
select species_id, meta_value from meta where meta_key = 'species.db_name';

Running this query shows that, at the time of writing, e_coli_k12 has species id 1 and e_coli_sakai has species id 12. Using this information we can go back to your original example:

# K12: species id 1 in the collection database
my $db = new Bio::EnsEMBL::DBSQL::DBAdaptor(-species=> 'e_coli_k12', -dbname=>'escherichia_shigella_collection_core_9_62_5a', -species_id => 1, -multispecies_db => 1);
my $slice_adaptor = $db->get_SliceAdaptor();
my $chromo = $slice_adaptor->fetch_by_region('chromosome', 'Chromosome'); # the E. coli chromosome is named "Chromosome"
my @genes = @{ $chromo->get_all_Genes() };
print(Dump(@genes));

# Sakai: species id 12 in the collection database
$db = new Bio::EnsEMBL::DBSQL::DBAdaptor(-species=> 'e_coli_sakai', -dbname=> 'escherichia_shigella_collection_core_9_62_5a', -species_id => 12, -multispecies_db => 1);
$slice_adaptor = $db->get_SliceAdaptor();
$chromo = $slice_adaptor->fetch_by_region('chromosome', 'Chromosome'); # the E. coli chromosome is named "Chromosome"
@genes = @{ $chromo->get_all_Genes() };
print(Dump(@genes));
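
If you want to confirm that the two adaptors really return different gene sets (rather than just different counts), one option is to compare the gene stable IDs. The following is only a minimal sketch in plain Perl: compare_ids is a hypothetical helper, not part of the Ensembl API, and the IDs other than EBESCG00000001004 are made-up placeholders. In a real script the two lists would presumably be built from the gene objects, e.g. with map { $_->stable_id } @genes.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two lists of gene stable IDs.
# Returns a hash ref with three sorted array refs:
#   shared - IDs present in both lists
#   only_a - IDs present only in the first list
#   only_b - IDs present only in the second list
sub compare_ids {
    my ($ids_a, $ids_b) = @_;
    my %in_a = map { $_ => 1 } @$ids_a;
    my %in_b = map { $_ => 1 } @$ids_b;
    return {
        shared => [ sort { $a cmp $b } grep {  $in_b{$_} } keys %in_a ],
        only_a => [ sort { $a cmp $b } grep { !$in_b{$_} } keys %in_a ],
        only_b => [ sort { $a cmp $b } grep { !$in_a{$_} } keys %in_b ],
    };
}

# Placeholder data for illustration only; real lists would come from the API.
my @k12_ids   = qw(EBESCG00000001004 PLACEHOLDER_A PLACEHOLDER_B);
my @sakai_ids = qw(PLACEHOLDER_B PLACEHOLDER_C);

my $cmp = compare_ids(\@k12_ids, \@sakai_ids);
printf "shared: %d, K12 only: %d, Sakai only: %d\n",
    scalar @{ $cmp->{shared} },
    scalar @{ $cmp->{only_a} },
    scalar @{ $cmp->{only_b} };
```

If the multi-species options are set correctly, a strain-specific gene such as EBESCG00000001004 should end up in the "K12 only" list and not in the shared one.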

HTH


Of course, this helped me a lot. Thank you so much.
