How to retrieve mouse (mm10) gene information from Ensembl using biomaRt in R
9.4 years ago
M K ▴ 660

I am trying to retrieve mouse mm10 gene information using the biomaRt library in R, but I don't know how to do that.

(The information I need is mm10.knownGene.name, mm10.knownGene.chrom, mm10.knownGene.strand, mm10.knownGene.txStart, mm10.knownGene.txEnd and mm10.kgXref.geneSymbol.)

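# biocLite() has been retired; current Bioconductor installs go through BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("biomaRt")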

library(biomaRt)

mouse = useMart("ensembl", dataset = "mmusculus_gene_ensembl")
listFilters(mouse)

getBM(attributes=c("ensembl_gene_id", "mgi_symbol"), filters= "mgi_symbol", mart=mouse)
gene R • 27k views

What is mm10.kgXref.geneSymbol?

kgXref.geneSymbol is the gene name that I got when I downloaded the knownGene table from the UCSC website using the Table Browser.

9.4 years ago
komal.rathi ★ 4.1k

If you just want mm10 symbol, chr, strand, transcript start & end, you could do this:

res <- getBM(attributes = c("ensembl_gene_id", "mgi_symbol","chromosome_name",'strand','transcript_start','transcript_end'), mart = mouse)

If you have a list of genes:

# genesym is a character vector of gene symbols
res <- getBM(attributes = c("ensembl_gene_id", "mgi_symbol", "chromosome_name", "strand", "transcript_start", "transcript_end"), filters = "mgi_symbol", values = genesym, mart = mouse)
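One thing to watch when comparing against the UCSC knownGene table: Ensembl reports chromosomes as "1", "X", "MT" and strand as 1 / -1, while UCSC uses "chr1", "chrX", "chrM" and "+" / "-", and UCSC txStart is 0-based. A minimal sketch of the conversion, assuming the attribute names used above. `to_ucsc_style` is a hypothetical helper, and the small data frame holds made-up placeholder rows (not real genes) so the snippet runs without a network connection:

```r
# Hypothetical helper: convert a getBM() result to UCSC-style columns.
# Assumes the columns requested above: chromosome_name, strand,
# transcript_start, transcript_end, mgi_symbol.
# The "chr" prefix is only valid for assembled chromosomes; unplaced
# scaffolds are named differently at UCSC and are not handled here.
to_ucsc_style <- function(res) {
  chrom <- paste0("chr", res$chromosome_name)
  chrom[chrom == "chrMT"] <- "chrM"  # UCSC calls the mitochondrion chrM
  data.frame(
    chrom      = chrom,
    strand     = ifelse(res$strand == 1, "+", "-"),
    txStart    = res$transcript_start - 1L,  # 1-based (Ensembl) to 0-based (UCSC)
    txEnd      = res$transcript_end,
    geneSymbol = res$mgi_symbol,
    stringsAsFactors = FALSE
  )
}

# Placeholder rows standing in for real getBM() output (values are invented)
example_res <- data.frame(
  chromosome_name  = c("1", "MT"),
  strand           = c(1L, -1L),
  transcript_start = c(100L, 500L),
  transcript_end   = c(200L, 900L),
  mgi_symbol       = c("GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

ucsc_like <- to_ucsc_style(example_res)
print(ucsc_like)
```

With real data you would call `to_ucsc_style(res)` on the data frame returned by getBM().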

Thanks Komal for helping me. I ran that code and it works. But I have a question: how can I specify the release version? I am going to run this code for other species as well, such as human (hg19), and I don't know which release I will retrieve from Ensembl. Is there any way to include the release number in this code?

The current Ensembl database should be mm10, the most up-to-date one. For changing genome versions, see my reply.

9.4 years ago
Sakti ▴ 530
library(biomaRt)
ensembl <- useMart("ensembl", dataset="mmusculus_gene_ensembl")
annot<-getBM(c("ensembl_gene_id", "mgi_symbol", "chromosome_name", "strand", "start_position", "end_position","gene_biotype"), mart=ensembl)

# For older biomart releases I have used
ensembl <- useMart("ENSEMBL_MART_ENSEMBL", dataset="mmusculus_gene_ensembl", host="jul2012.archive.ensembl.org")

#modify the host depending on when your genomic version was released, or which ensembl archive you want to use
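Newer biomaRt versions also let you pick a release by number instead of by archive host. A rough sketch, assuming a biomaRt version that provides `useEnsembl()` and `listEnsemblArchives()`. `connect_ensembl` is a hypothetical wrapper, and release 102 is my assumption for the last GRCm38 (mm10) release, so check it against the archive table before relying on it. The calls need network access, so they are only run in an interactive session:

```r
# Hypothetical wrapper around biomaRt::useEnsembl(); needs the biomaRt
# package and a network connection when it is actually called.
connect_ensembl <- function(dataset, version = NULL, GRCh = NULL) {
  biomaRt::useEnsembl(
    biomart = "ensembl",
    dataset = dataset,
    version = version,  # Ensembl release number, e.g. 102
    GRCh    = GRCh      # 37 selects the dedicated human GRCh37 site
  )
}

if (interactive()) {
  # Table of available archives with their release numbers, dates and hosts
  print(biomaRt::listEnsemblArchives())

  # Mouse GRCm38 (mm10); release 102 is assumed to be the last one on GRCm38
  mouse_mm10 <- connect_ensembl("mmusculus_gene_ensembl", version = 102)

  # Human GRCh37 (hg19)
  human_hg19 <- connect_ensembl("hsapiens_gene_ensembl", GRCh = 37)
}
```

Pinning the release this way keeps the results reproducible when Ensembl moves to a new assembly.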

I got it. Thanks a lot Sakti.
