Question

Creating TxDB objects from different sources

4.4 years ago

mickey_95 ▴ 110

Hi!

I have been trying to understand the confusing world of genome annotations with the goal of consistently using the same version throughout all analyses for one project.

Initially I was using UCSC annotations by simply loading the corresponding library in R:

library(TxDb.Mmusculus.UCSC.mm10.ensGene)

This is where various questions started appearing: I thought that these were Ensembl-style annotations for the Ensembl reference genome. However, I then read somewhere that these are UCSC's annotations for the Ensembl genome sequence. Is this correct?

Also, I noticed that the above TxDb object is not based on the newest release, so I decided to look into creating my own TxDb objects. Since I used Ensembl's GRCm38 release 99 for the alignment of the data, I would like to use the same annotation version for downstream analyses. For this, I found two options: using BioMart or Ensembl as the resource. Considering that the default host URL of BioMart is ensembl.org, I would expect the two calls below to yield the same result. Is this correct?

library(GenomicFeatures)

# from BioMart
BiomartTxDb <- makeTxDbFromBiomart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")

# from Ensembl
EnsemblTxDb <- makeTxDbFromEnsembl(organism = "Mus musculus", release = 99)

One difference I noticed is that makeTxDbFromBiomart uses the most recent release (release 100). Is there an option to specify the release? I couldn't find one, but maybe I overlooked something.

I am also wondering about different naming conventions:

seqlevelsStyle(EnsemblTxDb) <- "UCSC"

If I change the naming of my EnsemblTxDb object to UCSC, would this simply change "1" to "chr1" without affecting the underlying position information? Or do the coordinates also get converted between 0-based and 1-based conventions?
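To make the question concrete, here is a small check I sketched on a toy GRanges instead of the full TxDb. The sequence names and positions are made up for illustration, and I am assuming that a TxDb object would behave the same way as a GRanges when its style is changed:

```r
library(GenomicRanges)
library(GenomeInfoDb)  # provides seqlevelsStyle()

# Toy ranges with Ensembl/NCBI-style sequence names (positions are made up)
gr <- GRanges(
  seqnames = c("1", "X", "MT"),
  ranges = IRanges(start = c(100, 200, 300), width = 50)
)

# Remember the coordinates before renaming
starts_before <- start(gr)
ends_before <- end(gr)

# Switch the naming convention to UCSC style
seqlevelsStyle(gr) <- "UCSC"

# Inspect the renamed sequence levels
seqlevels(gr)

# TRUE here would mean only the names changed, not the positions
identical(starts_before, start(gr)) && identical(ends_before, end(gr))
```

If the last line returns TRUE, then at least for GRanges the style change is only a renaming of the sequences.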

I find all of this very confusing and would like to get some input to be sure that I am understanding everything correctly.

TxDb genome annotation Biomart Ensembl • 1.8k views
