Is BioMart up to date with NCBI?
9.6 years ago

Hi all,

I want to convert mouse (MGI) gene symbols to Entrez Gene IDs using BioMart's R interface.

Every now and then I come across a mouse gene symbol for which BioMart does not find an Entrez Gene ID, even though I can find one on the NCBI website.

For example, I use the following R code to try to find an Entrez Gene ID for the gene symbol 0610009E02Rik:

library("biomaRt")
# Connect to the Ensembl mart and select the mouse gene dataset
ensembl <- useMart("ensembl")
ensembl <- useDataset("mmusculus_gene_ensembl", mart = ensembl)
# Look up the Entrez Gene ID for each MGI symbol
geneSymbs <- c("0610009E02Rik")
geneSymbsEntrezGenes <- getBM(attributes = c("mgi_symbol", "entrezgene"),
                              filters = "mgi_symbol", values = geneSymbs, mart = ensembl)

Printing the result shows that no Entrez Gene ID was found:

> geneSymbsEntrezGenes
     mgi_symbol entrezgene
1 0610009E02Rik         NA

However, I can find an Entrez Gene ID for this gene symbol on the NCBI website: http://www.ncbi.nlm.nih.gov/gene/?term=0610009E02Rik

So it seems to me that the version of BioMart (biomaRt_2.18.0) I am using is not up to date with NCBI.

Is BioMart perhaps compiled periodically (e.g. once a month or every second month) from NCBI?

If so, should I just access NCBI directly and skip BioMart if I want to be sure I get the most up-to-date conversions?

Or am I perhaps using an outdated version of BioMart?

Thanks,
Erno Lindfors


It looks like you want to map MGI accessions to Entrez Gene IDs. I've had trouble getting consistent results from BioMart at times, and I see what looks like the same problem when using the BioMart web tool on the Ensembl website.

You might want to get the accessions directly from Jax: ftp://ftp.informatics.jax.org/pub/reports/index.html#marker

There's a file listed there called "MGI Marker associations to Entrez Gene (tab-delimited)"; this might be an easier way of getting the most up-to-date data.
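Once the file is downloaded, looking up symbols in it could go roughly like this in R. This is only a sketch: the three-column layout, the column names (`mgi_id`, `symbol`, `entrez_id`) and all values are placeholders I made up for illustration, so check the real report's columns before adapting it.

```r
# Stand-in for the downloaded report. The layout and every value here are
# invented placeholders and will differ from the real MGI file.
report_text <- paste(
  "MGI:PLACEHOLDER1\tGeneA\t1001",
  "MGI:PLACEHOLDER2\tGeneB\t",
  sep = "\n"
)

# Read everything as character so IDs are not mangled by type conversion
mapping <- read.delim(text = report_text, header = FALSE,
                      col.names = c("mgi_id", "symbol", "entrez_id"),
                      colClasses = "character", quote = "",
                      stringsAsFactors = FALSE)

# Treat empty Entrez fields as missing
mapping$entrez_id[mapping$entrez_id == ""] <- NA

# Named vector used as a lookup table: symbol -> Entrez Gene ID
symbol_to_entrez <- setNames(mapping$entrez_id, mapping$symbol)

# Symbols that are absent from the file also come back as NA
wanted <- c("GeneA", "GeneB", "GeneX")
found <- unname(symbol_to_entrez[wanted])
print(data.frame(symbol = wanted, entrez_id = found))
```

For the real file you would pass its path to `read.delim()` instead of `text =`.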


Thanks, Joe and Emily, for your comments!

I tested the file from the Jax website, and it indeed seems to contain the most up-to-date conversions.

With best regards,
Erno Lindfors

9.6 years ago
Emily 24k

BioMart comes from Ensembl, so what you get from BioMart is the current Ensembl data. Ensembl maps MGI IDs to Ensembl genes and, separately, Entrez IDs to Ensembl genes. When you convert an MGI ID to an Entrez ID you are therefore actually converting MGI->Ensembl->Entrez, and if either of those steps is missing you won't get the conversion. The second point is that BioMart accesses the current Ensembl release, so it fetches our most recent data freeze of MGI and Entrez. Anything newer will not be picked up.
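To illustrate why a broken link in the MGI->Ensembl->Entrez chain ends up as `NA`, here is a small sketch with toy tables in base R. It is not how Ensembl builds its mappings internally; the table names and all IDs are made-up placeholders, and `merge()` simply stands in for the two-step lookup.

```r
# Toy cross-reference tables. Every ID below is a made-up placeholder,
# not real MGI, Ensembl, or Entrez data.
mgi_to_ensembl <- data.frame(
  mgi_symbol = c("GeneA", "GeneB", "GeneC"),
  ensembl_gene_id = c("ENS_PLACEHOLDER_1", "ENS_PLACEHOLDER_2", NA),
  stringsAsFactors = FALSE
)
ensembl_to_entrez <- data.frame(
  ensembl_gene_id = c("ENS_PLACEHOLDER_1"),
  entrezgene = c(1001L),
  stringsAsFactors = FALSE
)

# Step 1 (symbol -> Ensembl gene) is the first table; step 2
# (Ensembl gene -> Entrez ID) is the join. all.x = TRUE keeps symbols
# whose chain is broken, with NA as their Entrez ID.
chained <- merge(mgi_to_ensembl, ensembl_to_entrez,
                 by = "ensembl_gene_id", all.x = TRUE)
result <- chained[order(chained$mgi_symbol), c("mgi_symbol", "entrezgene")]
print(result)
```

Here GeneA converts because both links exist, GeneB fails at the Ensembl->Entrez step, and GeneC fails at the MGI->Ensembl step. From the outside, both failures look the same: an `NA` in the `entrezgene` column.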


Good to know that it always maps things through Ensembl stable IDs.
