background exome gVCFs for haplotype calling
4.3 years ago
ccagg ▴ 60

I have a relatively small sample of human exomes (n=11) that I would like to call SNPs for, using the GATK pipeline. From reading the GATK documentation, it seems that the best way to do this is to use many exomes as "background" for the genotype calling and refinement steps. My lab, however, is very new to exome-seq, and we only have the 11 we generated on hand.

Is there a database of exome data that I could use as the background? I think gVCFs would be preferable, but I could work from FASTQ or BAM files if necessary. We have access to the UK Biobank, but there seem to have been some issues with their exome data that might dissuade me from using their gVCFs. If no exomes are available, would there be a problem with using genomic data (like the gVCFs available from HGDP) for this step?

exome GATK • 1.1k views

GATK should offer resources if they recommend something in their pipeline. Can you show us the page where they make this recommendation?


Yeah, I guess they don't explicitly say it, but here they definitely seem to allude to having a large cohort. Several people at my institute have also recommended running it with other background exomes. I think this makes sense, since the refinement is a machine learning-based method.
