Is there any difference between the biomaRt package and the BioMart tab on the Ensembl website?
9.4 years ago
M K ▴ 660

Hi everyone,

I am trying to retrieve gene information from the Ensembl website so that I can compare the mouse (mm10) gene annotation with repetitive DNA in specific genome regions (UTRs, introns, and upstream regions). I obtained the gene files in two ways: first with the R code below, and second by going directly to the Ensembl website and using the BioMart tab.

I have two issues. The first is that the total number of observations (rows) differs between the two files.

The second issue is that when I look for genes that share a position with the repetitive DNA in these specific regions, I get an empty result, and I don't know what causes that. BTW, I downloaded the repetitive DNA files from the UCSC website, using Ensembl genes in the track tab.

R code to retrieve the gene info.

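# biocLite() has since been retired; BiocManager is the current Bioconductor installer
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("biomaRt")
library(biomaRt)

### Retrieving mouse (mm10/GRCm38) from Ensembl website ###
mouse <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")

# transcript_start and transcript_end are transcript-level attributes
mm10_Gene <- getBM(attributes = c("ensembl_gene_id", "chromosome_name", "strand",
                                  "transcript_start", "transcript_end", "mgi_symbol"),
                   mart = mouse)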
gene R Assembly sequencing • 3.0k views
9.4 years ago
Ying W ★ 4.3k

As long as you are on the same release, the results should be the same. I'm not sure how to tell which release the Bioconductor package is using, but it might be a couple of releases behind the website.

Could you give an example of a gene in repetitive DNA that you can find on the website but not through biomaRt?
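To find such an example, the two exports can be compared by gene ID. The sketch below uses small made-up tables in place of the real ones (the IDs, coordinates, and the names `from_biomart` and `from_website` are hypothetical). It also illustrates one possible reason for a row-count mismatch: a query that includes transcript-level attributes can return several rows per gene.

```r
# Toy stand-ins for the two exports (hypothetical values). In practice,
# from_biomart would come from getBM() and from_website from the file
# downloaded through the BioMart web interface.
from_biomart <- data.frame(
  ensembl_gene_id  = c("geneA", "geneA", "geneB", "geneC"),
  transcript_start = c(100, 150, 900, 2000)
)
from_website <- data.frame(
  ensembl_gene_id = c("geneA", "geneB", "geneD"),
  gene_start      = c(100, 900, 5000)
)

# Rows versus distinct genes: geneA appears twice in the transcript-level table
n_rows  <- nrow(from_biomart)
n_genes <- length(unique(from_biomart$ensembl_gene_id))

# Genes that appear in only one of the two exports
only_website <- setdiff(from_website$ensembl_gene_id, from_biomart$ensembl_gene_id)
only_biomart <- setdiff(from_biomart$ensembl_gene_id, from_website$ensembl_gene_id)

print(c(rows = n_rows, genes = n_genes))
print(only_website)
print(only_biomart)
```

If both `setdiff()` results are empty on the real data while the row counts still differ, the difference is more likely in the attributes requested than in the annotation release.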


Hi Ying,

I used the mouse (mm10) release, which is the latest release. Then I used the Table Browser in UCSC to download the repetitive DNA: in the track tab I chose Ensembl genes, and from the get output tab I got, for example, the "Introns plus" region. Since UCSC doesn't provide gene info for Ensembl genes, especially MGI symbols, I retrieved the gene info from the Ensembl website directly or by using the R code above.


Not the mouse reference, but the annotation release. If you look on the Ensembl website, it is currently on release 80. UCSC is probably using a different release as well; annotations are updated more often than the reference is.
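A minimal sketch of pinning the annotation release from R, assuming a reasonably recent biomaRt that provides `listEnsemblArchives()` and `useEnsembl()`. Release 80 is used only because it is the release mentioned here; whether the Ensembl archive still serves it is an assumption, and the object names are illustrative.

```r
library(biomaRt)

# List the Ensembl archive releases that biomaRt knows about
archives <- listEnsemblArchives()
head(archives)

# Connect to one specific annotation release instead of the current one
# (assumes release 80 is still available from the archive)
mouse_r80 <- useEnsembl(biomart = "ensembl",
                        dataset = "mmusculus_gene_ensembl",
                        version = 80)

# Same attributes as the query in the question, now tied to a known release
mm10_gene_r80 <- getBM(attributes = c("ensembl_gene_id", "chromosome_name", "strand",
                                      "transcript_start", "transcript_end", "mgi_symbol"),
                       mart = mouse_r80)
```

Downloading from the matching archive site in the browser should then give a like-for-like comparison with the R result.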


So is there any way to download the repetitive DNA from the Ensembl website directly, like the one on UCSC? For example, I want to download the intronic, CDS, 10 kb upstream, and 10 kb downstream regions for mouse (mm10) and human (hg19). I think that by doing that, the annotation data and the repetitive DNA will be consistent for this analysis, since they come from the same source, which is Ensembl.
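If the UCSC regions are kept, one thing worth checking for the empty overlap is chromosome naming: UCSC writes "chr1" and "chrM", while Ensembl writes "1" and "MT", so a join on chromosome name matches nothing until the names are harmonized. This has not been confirmed as the cause in this thread. The sketch below uses made-up coordinates, and `to_ensembl_chrom` is a hypothetical helper, not a biomaRt function.

```r
# Toy gene table in Ensembl style and toy repeat table in UCSC style
# (all coordinates and symbols are made up for illustration)
genes <- data.frame(
  chrom  = c("1", "2"),
  start  = c(100, 500),
  end    = c(400, 900),
  symbol = c("GeneA", "GeneB")
)
repeats <- data.frame(
  chrom = c("chr1", "chr2"),
  start = c(350, 1000),
  end   = c(380, 1100)
)

# Hypothetical helper: strip the "chr" prefix and rename M to MT
to_ensembl_chrom <- function(x) {
  x <- sub("^chr", "", x)
  ifelse(x == "M", "MT", x)
}

# Without this step the merge below returns zero rows
repeats$chrom <- to_ensembl_chrom(repeats$chrom)

# Pair genes and repeats on the same chromosome, then keep interval overlaps
pairs <- merge(genes, repeats, by = "chrom", suffixes = c("_gene", "_repeat"))
hits  <- pairs[pairs$start_gene <= pairs$end_repeat &
               pairs$end_gene   >= pairs$start_repeat, ]

print(hits)
```

For genome-scale tables this all-pairs merge is slow; `GenomicRanges::findOverlaps()` is the usual choice. Note also that UCSC BED output uses 0-based starts while Ensembl coordinates are 1-based, which shifts positions by one base.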
