Hi all,
I want to convert mouse (MGI) gene symbols to Entrez Gene IDs using BioMart's R interface (biomaRt).
Every now and then I come across a mouse gene symbol for which BioMart does not find an Entrez Gene ID, yet I can find one on the NCBI website.
For example, I use the following R code to look up the Entrez Gene ID for the gene symbol 0610009E02Rik:
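# Load biomaRt and connect to the Ensembl mouse gene dataset
library("biomaRt")
ensembl <- useMart("ensembl")
ensembl <- useDataset("mmusculus_gene_ensembl", mart = ensembl)
# Gene symbol(s) to look up
geneSymbs <- c("0610009E02Rik")
# Query the MGI symbol together with its Entrez Gene ID
geneSymbsEntrezGenes <- getBM(attributes = c("mgi_symbol", "entrezgene"), filters = "mgi_symbol", values = geneSymbs, mart = ensembl)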
Printing the result shows that no Entrez Gene ID was found:
> geneSymbsEntrezGenes
     mgi_symbol entrezgene
1 0610009E02Rik         NA
However, I can find an Entrez Gene ID for this gene symbol on the NCBI website: http://www.ncbi.nlm.nih.gov/gene/?term=0610009E02Rik
So it seems to me that the version of BioMart I am using (biomaRt_2.18.0) is not up to date with NCBI.
Is BioMart perhaps rebuilt from NCBI only periodically (e.g. once a month or every second month)?
If so, should I just access NCBI directly and skip BioMart when I want to be sure of getting the most up-to-date conversions?
Or am I perhaps using an outdated version of BioMart?
Thanks,
Erno Lindfors
It looks like you want to map MGI accessions to Entrez Gene IDs. I've had trouble getting consistent results from BioMart at times, and I'm seeing what looks like the same problem when I use the BioMart web tool on the Ensembl website.
You might want to get the accessions directly from Jax: ftp://ftp.informatics.jax.org/pub/reports/index.html#marker
There's a file listed there called "MGI Marker associations to Entrez Gene (tab-delimited)", which might be an easier way of getting the most up-to-date data.
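As a rough sketch of how that file could be used from R: the snippet below assumes the report has already been downloaded to a local tab-delimited file and that you have checked which columns hold the marker symbol and the Entrez Gene ID. The helper name lookup_entrez, the file path, and the column numbers in the usage comment are illustrative assumptions rather than anything documented by Jax, so verify them against the file you actually download.

```r
# Hypothetical helper: look up Entrez Gene IDs for marker symbols in a
# data frame read from the MGI report. The column positions are passed in
# by the caller because the layout is an assumption to be verified.
lookup_entrez <- function(report, symbols, symbol_col, entrez_col) {
  # Match each requested symbol to its first occurrence in the report
  idx <- match(symbols, report[[symbol_col]])
  data.frame(
    mgi_symbol = symbols,
    entrezgene = report[[entrez_col]][idx],  # NA where the symbol is absent
    stringsAsFactors = FALSE
  )
}

# Example usage (file path and column numbers are placeholders):
# report <- read.delim("path/to/downloaded_report.rpt", header = FALSE,
#                      quote = "", stringsAsFactors = FALSE)
# lookup_entrez(report, c("0610009E02Rik"), symbol_col = 2, entrez_col = 9)
```

Symbols that are missing from the report come back as NA, in the same shape as the getBM result, so the two sources can be compared row by row.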
Thanks, Joe and Emily, for your comments!
I tested the file from the Jax website, and it does indeed seem to contain the most up-to-date conversions.
With best regards,
Erno Lindfors