Download/cache Ensembl biomaRt data for genes and their annotations?
8.6 years ago
biocyberman ▴ 870

I am working with human and rat gene sets. From a list of Ensembl gene IDs, I want to retrieve columns of attributes via biomaRt. With about 4000 genes, the query runs very slowly (30 minutes). I can save the R object and reuse it next time. But is there any way for me to download a whole package of gene annotation information, including Gene Ontology, RFAM, PFAM, InterPro, etc.? In particular, I am interested in downloading the attributes listed in the snippet below.

This is a snippet for what I am trying to do:

library(biomaRt)
# Example of 20 gene IDs.
ensids <- c(
'ENSRNOG00000000001', 
'ENSRNOG00000000009', 
'ENSRNOG00000000040', 
'ENSRNOG00000000055', 
'ENSRNOG00000000082', 
'ENSRNOG00000000091', 
'ENSRNOG00000000129', 
'ENSRNOG00000000137', 
'ENSRNOG00000000138', 
'ENSRNOG00000000142', 
'ENSRNOG00000000156', 
'ENSRNOG00000000187', 
'ENSRNOG00000000196', 
'ENSRNOG00000000231', 
'ENSRNOG00000000233', 
'ENSRNOG00000000239', 
'ENSRNOG00000000277', 
'ENSRNOG00000000288', 
'ENSRNOG00000000307', 
'ENSRNOG00000000321')

m <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", dataset = "rnorvegicus_gene_ensembl")

enstable <- getBM(mart = m, attributes = c('ensembl_gene_id','gene_biotype',
                                       'external_gene_name', 'superfamily',
                                       'family', 'go_id','goslim_goa_accession',
                                       'rfam', 'pirsf','interpro','tigrfam'),
              filters = c('ensembl_gene_id'), values = ensids)

The first download may take longer, but I see much greater benefits for subsequent uses: the Ensembl server is not stressed by repeated queries, the runtime is shorter, and no internet connection is needed.
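
To make the "save the R object and reuse it" idea concrete, here is a minimal sketch of a local cache around the query. The helper names `cached_fetch` and `fetch_annotations` and the file name `rat_annotations.rds` are my own placeholders, not part of biomaRt. The sketch assumes it is acceptable to store the result as an `.rds` file next to the analysis.

```r
# Hypothetical helper (not part of biomaRt): run `fetch_fun` once,
# save the result to `cache_file`, and reuse the saved copy afterwards.
cached_fetch <- function(cache_file, fetch_fun, refresh = FALSE) {
  if (!refresh && file.exists(cache_file)) {
    return(readRDS(cache_file))  # cache hit: no network access needed
  }
  result <- fetch_fun()          # cache miss: run the slow query
  saveRDS(result, cache_file)    # store it for later sessions
  result
}

# Hypothetical wrapper around the getBM() call from the snippet above.
# biomaRt is only needed when the cache is empty or refreshed.
fetch_annotations <- function(ids, attrs) {
  m <- biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                        dataset = "rnorvegicus_gene_ensembl")
  biomaRt::getBM(mart = m, attributes = attrs,
                 filters = "ensembl_gene_id", values = ids)
}

# Usage (not run here, because it queries the Ensembl server):
# enstable <- cached_fetch(
#   "rat_annotations.rds",
#   function() fetch_annotations(ensids, c("ensembl_gene_id", "gene_biotype"))
# )
```

With this pattern only the first call contacts the server. Passing `refresh = TRUE` forces a new download, for example after a new Ensembl release.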

