23 months ago
DavidStreid
Hi,
GATK recommends GVCF mode in its Best Practices for germline SNP/indel discovery. However, for a single sample rather than a cohort analysis, does GVCF mode offer any benefit (e.g. improved precision or sensitivity) beyond speed/efficiency?
i.e.
For a single sample, why use GVCF-mode,
HaplotypeCaller ... --emit-ref-confidence GVCF -O sample.g.vcf
GenotypeGVCFs ... --variant sample.g.vcf -O sample.vcf
instead of outputting the VCF directly
HaplotypeCaller ... -O sample.vcf
Thank you - any advice or notes from personal experience would be appreciated.
Best,
David
There is really no point in creating a GVCF and then throwing it away. However, I would argue that the GVCF itself is preferable to the standard VCF for single samples, because it lets you answer questions like "Could my sample have <variant X>?" You can look at the GVCF to see whether the site is confidently homozygous reference or poorly covered, which you cannot do with the standard VCF. The genotypes at non-reference sites should be identical in both files.
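To make the "could my sample have variant X?" check concrete, here is a minimal Python sketch that looks up one position in a single-sample GVCF. It relies on the usual GATK GVCF layout (reference blocks carry `<NON_REF>` as the ALT allele and an `END=` INFO tag, with `GT`, `GQ` and `MIN_DP`/`DP` in the sample column). The function name `classify_site`, the `EXAMPLE_GVCF` records, and the `min_gq`/`min_dp` cutoffs are all hypothetical and chosen only for illustration; they are not GATK recommendations.

```python
# Hypothetical, hand-written GVCF records (tab-separated) for illustration only.
EXAMPLE_GVCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample",
    "chr1\t100\t.\tA\t<NON_REF>\t.\t.\tEND=200\tGT:DP:GQ:MIN_DP:PL\t0/0:30:90:28:0,90,1350",
    "chr1\t201\t.\tG\t<NON_REF>\t.\t.\tEND=250\tGT:DP:GQ:MIN_DP:PL\t0/0:3:6:2:0,6,90",
    "chr1\t251\t.\tC\tT,<NON_REF>\t500\t.\tDP=40\tGT:AD:DP:GQ:PL\t0/1:20,20,0:40:99:500,0,500,560,560,1120",
]


def _to_int(value):
    """Parse an integer field, treating missing values ('.') as 0."""
    return int(value) if value.isdigit() else 0


def classify_site(gvcf_lines, chrom, pos, min_gq=20, min_dp=10):
    """Classify one position using a single-sample GVCF.

    Returns 'variant', 'confident_hom_ref', 'low_confidence', or 'no_call'.
    The min_gq / min_dp cutoffs are assumed values, not GATK defaults.
    """
    for line in gvcf_lines:
        if not line or line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10 or fields[0] != chrom:
            continue
        start = int(fields[1])
        info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
        # Reference blocks span POS..END; other records span the REF allele.
        end = int(info.get("END", start + len(fields[3]) - 1))
        if not start <= pos <= end:
            continue
        sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
        gt = sample.get("GT", "./.").replace("|", "/")
        real_alts = [a for a in fields[4].split(",") if a != "<NON_REF>"]
        if real_alts and any(a not in ("0", ".") for a in gt.split("/")):
            return "variant"
        depth = _to_int(sample.get("MIN_DP", sample.get("DP", "0")))
        gq = _to_int(sample.get("GQ", "0"))
        if gt == "0/0" and gq >= min_gq and depth >= min_dp:
            return "confident_hom_ref"
        return "low_confidence"
    return "no_call"  # position not covered by any record


for position in (150, 220, 251, 300):
    print(position, classify_site(EXAMPLE_GVCF, "chr1", position))
```

With only the plain VCF, positions 150, 220 and 300 would all simply be absent, so a confident reference call, a poorly covered site and a site with no data would be indistinguishable.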
Right, the GVCF distinguishes reference from no-call; in the VCF that distinction is lost. If the work is truly single-sample, then yes, skipping a step is OK. But most work ends up involving other samples.
Makes sense - thank you, LChart & Jeremy!