Best Human Genome Reference File For Gatk?
11.4 years ago
newDNASeqer ▴ 790

On the GATK website (http://gatkforums.broadinstitute.org/discussion/1204/what-input-files-does-the-gatk-accept) and their public FTP server, I found a few different human genome references in their resource bundle: b37, b36, hg18, and hg19. Which one should I use for exome-sequencing data analysis?

In my pipeline, I started by aligning with BWA-MEM against the hg19 reference. Should I stay consistent and use the same reference with GATK? It seems to me that the Broad Institute people recommend b37, since they say hg18, b36, etc. were lifted over from b37, so I'm confused here. Thanks.

gatk reference bwa • 9.7k views

My practical advice would be to perform your BWA-MEM alignment using b37.

11.4 years ago

Yes, you do seem a bit confused. I find it's best to take a look at this FAQ from 1000 Genomes.

This GRCh37-derived alignment set includes chromosomal plus unlocalized and unplaced contigs, the rCRS mitochondrial sequence (AC:NC_012920), Human herpesvirus 4 type 1 (AC:NC_007605) and decoy sequence derived from HuRef, Human Bac and Fosmid clones and NA12878.

So, it's derived from GRCh37, just as UCSC hg19 is, but contains a different mitochondrial sequence, a herpesvirus, and some other unplaced contigs and sequences. The most practical difference is that the contigs are named 17 instead of chr17, using human chromosome 17 as an example.
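To see which naming convention a given reference uses, you can look at its FASTA index. Here is a minimal sketch, assuming a `.fai` index as written by `samtools faidx` (tab-separated, contig name in the first column). The function name `naming_style` is made up for illustration, and the offset columns in the sample text are placeholders rather than values from a real file.

```python
def naming_style(fai_text):
    """Guess the contig naming convention from the text of a .fai index."""
    # The first tab-separated column of each .fai line is the contig name.
    names = [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]
    if any(name.startswith("chr") for name in names):
        return "chr-prefixed (UCSC hg19 style)"
    return "unprefixed (b37 style)"


# Illustrative index snippets; only the name and length columns matter here.
hg19_like = "chr1\t249250621\t6\t50\t51\nchr17\t81195210\t100\t50\t51\n"
b37_like = "1\t249250621\t52\t60\t61\n17\t81195210\t100\t60\t61\n"

print(naming_style(hg19_like))
print(naming_style(b37_like))
```

This only checks the prefix; it says nothing about the mitochondrial sequence or the extra decoy contigs, which also differ between the two.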

And finally, yes, once you have generated an alignment to a specific reference genome you need to use the same reference genome in all of your downstream analyses. There is no good reason for "switching" that I can think of. It's like asking if you should use a French dictionary to decode an English text.


And note that the GATK contig/chromosome ordering is important for downstream processing, so pick a reference and stick with it throughout.
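A quick way to catch a name or ordering mismatch early is to compare the contigs in the alignment header against the reference index before running anything downstream. The sketch below assumes you have the SAM header as text (for example from `samtools view -H`) and the reference `.fai` as text; the helper names are hypothetical and this is not how GATK itself performs the check.

```python
def contigs_from_sam_header(header_text):
    """Return [(name, length), ...] from the @SQ lines of a SAM header, in order."""
    contigs = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:VALUE.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        contigs.append((fields["SN"], int(fields["LN"])))
    return contigs


def contigs_from_fai(fai_text):
    """Return [(name, length), ...] from a .fai index, in file order."""
    contigs = []
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs.append((name, int(length)))
    return contigs


def compare(bam_contigs, ref_contigs):
    """Classify how the alignment's contigs relate to the reference's."""
    if bam_contigs == ref_contigs:
        return "identical"
    if sorted(bam_contigs) == sorted(ref_contigs):
        return "same contigs, different order"
    return "different contigs"


header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:1\tLN:249250621\n@SQ\tSN:2\tLN:243199373\n"
fai_reordered = "2\t243199373\t0\t60\t61\n1\t249250621\t0\t60\t61\n"

print(compare(contigs_from_sam_header(header), contigs_from_fai(fai_reordered)))
```

Anything other than "identical" is a sign that the alignment and the reference you are about to hand to GATK do not belong together.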


Yes, I had just gone back to make this point in an edit!


I switched references once in the middle of a GATK pipeline, and it was like walking on hot coals at every step.
