Question

Differences Between Reference Human Genome Assemblies From Different Sources


11.7 years ago

alpha2zee ▴ 120

I am relatively new to analysis of whole transcriptome RNA sequencing data. I am planning to map human RNA sequencing reads against the reference human genome/transcriptome (i.e., generate BAM files from fastq files).

I notice that reference genome assemblies are available from a number of sources: UCSC (currently as hg19), Ensembl (currently as GRCh37.73), the 1000 Genomes Project (currently as v37), etc. All of these releases seem to be based on the Genome Reference Consortium's GRCh37 release.

(1) What are the differences between such different genome assemblies?

(2) What are the differences between the different releases from Ensembl (e.g., GRCh37.70 vs .71)?

(3) For my purpose, aligning raw reads to obtain gene expression data for differential expression analysis, does it matter if one used a particular GRCh37-based reference assembly for a group of samples, and, in the future, for another group of samples used a different GRCh37-based assembly (either a different source or the same source but a different release)?

(4) Finally, can I use the reference genome assembly from one source or release and a gene annotation file from another source or release as long as they all are based on GRCh37?

Thank you.

rna-seq • 14k views


score 4 · Answer 1 · 2013-11-22

1) What are the differences between such different genome assemblies?

see http://plindenbaum.blogspot.fr/2013/07/g1kv37-vs-hg19.html

2) What are the differences between the different releases from Ensembl

see the related post "What's the difference between two versions of the same assembly?"

3) For human, I would say you'd better use the data from the GATK bundle, to stay close to their pipeline.

4) Yes, but you'll have to verify that they use the same names for the chromosomes (e.g. the "chr" prefix: UCSC names chromosome 1 "chr1", while Ensembl names it "1").
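As a rough illustration of the check in point 4, here is a minimal Python sketch that compares the sequence names in a FASTA reference with those in the first column of a GTF annotation. The function names (`fasta_seq_names`, `gtf_seq_names`, `compare_names`) and the toy records are made up for this example; with real data you would pass open file handles instead of the in-memory lists.

```python
from typing import Iterable, Set, Tuple


def fasta_seq_names(lines: Iterable[str]) -> Set[str]:
    """Collect sequence names from FASTA headers (text after '>' up to the first whitespace)."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_seq_names(lines: Iterable[str]) -> Set[str]:
    """Collect sequence names from column 1 of a GTF, skipping comment and blank lines."""
    names = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_names: Set[str], gtf_names: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (names only in the FASTA, names only in the GTF)."""
    return fasta_names - gtf_names, gtf_names - fasta_names


# Toy data (hypothetical): a UCSC-style FASTA and an Ensembl-style GTF.
fasta_lines = [">chr1 example sequence\n", "ACGT\n", ">chrM\n", "ACGT\n"]
gtf_lines = [
    "#!genome-build GRCh37\n",
    '1\tsrc\tgene\t1\t4\t.\t+\t.\tgene_id "g1";\n',
    'MT\tsrc\tgene\t1\t4\t.\t+\t.\tgene_id "g2";\n',
]

only_in_fasta, only_in_gtf = compare_names(
    fasta_seq_names(fasta_lines), gtf_seq_names(gtf_lines)
)
print("only in FASTA:", sorted(only_in_fasta))
print("only in GTF:", sorted(only_in_gtf))
```

If either set is non-empty for the chromosomes you care about, features on those sequences will not be matched to the alignments, so the names need to be reconciled before counting reads.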

score 3 · Answer 2 · 2019-11-03


5.7 years ago

MatthewP ★ 1.4k

https://software.broadinstitute.org/gatk/documentation/article?id=23390
