10.7 years ago
pld
5.1k
I want to use BioMart to access Ensembl with the registry ensembl_mart_75, but I'm only interested in a specific list of species. When I run it now, I can see that it gathers every species in Ensembl. Is it possible to specify which species to choose?
I am accessing Ensembl via the BioMart Perl module.
Suppose I run the following code:
use strict;
use warnings;
use BioMart::Initializer;
use BioMart::Query;
use BioMart::QueryRunner;

# Registry file describing which marts to load
my $confFile = "conf.reg";
# 'cached' reuses an existing cached configuration when one is available
my $action = 'cached';
my $initializer = BioMart::Initializer->new('registryFile'=>$confFile, 'action'=>$action);
my $registry = $initializer->getRegistry;

# Build a query against the human gene dataset
my $query = BioMart::Query->new('registry'=>$registry, 'virtualSchemaName'=>'default');
$query->setDataset("hsapiens_gene_ensembl");
$query->addAttribute("ensembl_gene_id");
$query->formatter("TSV");

# Run the query and print the results
# ('my' is required here because of 'use strict')
my $query_runner = BioMart::QueryRunner->new();
$query_runner->execute($query);
$query_runner->printResults();
If the configuration for my registry file has not yet been performed, it attempts to configure every species in Ensembl. This takes a while, so I wanted to see if there is a way to speed it up by configuring only the species I want, rather than all of them.
You seem to have the species specified in your code:
$query->setDataset("hsapiens_gene_ensembl");
Is it doing something you don't expect?
Any time I've used Ensembl, step two is "choose species", so I don't understand what you mean by "it is gathering all species that are in ensembl." Can you provide more specific details about how you are accessing Ensembl?
Thanks for the edit. I generally use R/biomaRt, not the Perl registry, so I have not seen this issue. Hopefully our friendly Ensembl outreach team has some ideas.
Sorry for the cruddy OP, I got distracted and submitted before I finished. Is this possible in biomaRt?
In biomaRt, I'd type:
and the useMart() function might take a few seconds to run, at most.
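Back on the Perl side, one thing that may be worth trying is the includeDatasets attribute that appears on MartURLLocation entries in BioMart registry files. As I understand it, it takes a comma-separated list of dataset names and limits configuration to those datasets, but I have not verified this against ensembl_mart_75. The sketch below shows what conf.reg might look like under that assumption. The host, path, port, name, displayName and the mouse dataset are illustrative guesses, not values from the original post, so keep whatever your own registry file already uses and change only includeDatasets.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE MartRegistry>
<MartRegistry>
  <!-- Assumption: includeDatasets restricts configuration to the listed datasets. -->
  <!-- Assumption: host/path/port/name/displayName are placeholders for your own values. -->
  <MartURLLocation
      database="ensembl_mart_75"
      default="1"
      displayName="ENSEMBL GENES 75"
      host="feb2014.archive.ensembl.org"
      includeDatasets="hsapiens_gene_ensembl,mmusculus_gene_ensembl"
      martUser=""
      name="ensembl"
      path="/biomart/martservice"
      port="80"
      serverVirtualSchema="default"
      visible="1" />
</MartRegistry>
```

Since the script uses 'action'=>'cached', an existing cached configuration would probably still be picked up, so a one-off run with 'action'=>'clean' may be needed after editing the registry file.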