iGenomes... Which To Choose?
12.8 years ago
Huw ▴ 10

Can anyone shed some light on the relative merits of the different human iGenomes data sets (Ensembl, NCBI, UCSC), and on which is best suited to basic gene expression analysis with the Tuxedo suite?

Huw

next-gen reference • 3.7k views
12.5 years ago
deanna.church ★ 1.1k

hg19 is the same assembly as GRCh37 (http://www.ncbi.nlm.nih.gov/assembly/2758/). Since the release of GRCh37, the GRC (http://genomereference.org) has been releasing genome patches (http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/info/patches.shtml). Ensembl and NCBI annotate patch releases, but not always the same ones. For example, NCBI is showing GRCh37.p5 and Ensembl is showing GRCh37.p7. In all of these cases the chromosome coordinates are identical: the only difference between GRCh37 and any patch release is the patches themselves.

12.8 years ago
Ying W ★ 4.3k

It is my understanding that the difference between Ensembl/NCBI/UCSC is the sequence that you are aligning to. If you are going to visualize all your results in the UCSC Genome Browser using the hg19 assembly, then go with the UCSC one. NCBI/Ensembl might have newer human genome reference assemblies, or assemblies that include supercontigs and the mitochondrial sequence. If you have the space, you can download all of them and compare what differs between them (they will take up quite a bit of space, since most are about 10 GB compressed).
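If you do download more than one, a quick way to see what differs is to compare the sequence names and lengths in the FASTA files. Below is a minimal sketch of that idea, not a definitive tool: the function names and the file paths in the usage note are hypothetical, and it assumes the only systematic naming difference is a leading `chr` prefix on one side.

```python
import gzip
from typing import Dict, IO, Iterable, List


def open_text(path: str) -> IO[str]:
    """Open a plain or gzip-compressed FASTA file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def normalize(name: str) -> str:
    """Strip a leading 'chr' (assumption: that is the only prefix difference)."""
    return name[3:] if name.startswith("chr") else name


def compare(a: Dict[str, int], b: Dict[str, int]) -> Dict[str, List[str]]:
    """Report names unique to each set and shared names whose lengths differ."""
    na = {normalize(k): v for k, v in a.items()}
    nb = {normalize(k): v for k, v in b.items()}
    shared = set(na) & set(nb)
    return {
        "only_in_a": sorted(set(na) - set(nb)),
        "only_in_b": sorted(set(nb) - set(na)),
        "length_mismatch": sorted(n for n in shared if na[n] != nb[n]),
    }
```

Usage would look something like this (paths are placeholders for wherever you unpacked the archives):

```python
with open_text("UCSC/genome.fa") as fa, open_text("Ensembl/genome.fa") as fb:
    report = compare(sequence_lengths(fa), sequence_lengths(fb))
print(report)
```

Names that differ by more than the prefix (for example `chrM` versus `MT`) will show up as unique to each file, which is itself a useful thing to know before you pick one.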

12.8 years ago
Plasmid ▴ 160

I think you can use the most recent one. For example:

Ensembl GRCh37 9696 MB Oct 24 2011

NCBI build37.2 11786 MB Oct 24 2011
