11 months ago
Sd
Is there any way to determine with certainty whether a VCF file was individually called or jointly called? Is there a line in the VCF header that indicates this?
I am no expert on VCF files, but wouldn't a VCF file with multiple sample names in the header column line tell you that it's a joint call?
If they used bcftools to merge a bunch of gVCFs, then it wouldn't be joint genotyping in the same way GATK performs it, which leverages quality information from many samples to identify artefactual variants. Either way, there should be a line in the header.
I have GenotypeGVCFs command lines in the header of each VCF, but I am not sure whether they used bcftools or GATK tools (either CombineGVCFs or GenomicsDBImport).
GenotypeGVCFs would indicate GATK joint genotyping was used. Mystery solved.
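For anyone who wants to script the header check discussed above, here is a minimal sketch in Python (standard library only). It assumes the header line shapes that GATK and bcftools typically write, namely `##GATKCommandLine...=<ID=ToolName,...>` and `##bcftools_<subcommand>Command=...`; other versions or pipelines may record provenance differently or strip it entirely. The function names `read_header` and `summarize_header` are made up for this example. Note that a GenotypeGVCFs line alone may not be conclusive, since the tool can also be run on a single-sample gVCF, so the sketch reports the sample count alongside it.

```python
import gzip
import re


def read_header(path):
    """Return the header lines (starting with '#') of a plain or gzip/bgzip VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    header = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # first data record reached; the header is complete
            header.append(line.rstrip("\n"))
    return header


def summarize_header(header_lines):
    """Collect tool provenance and sample names from VCF header lines.

    Assumed formats (not guaranteed for every tool version):
      ##GATKCommandLine=<ID=GenotypeGVCFs,CommandLine="...">
      ##bcftools_mergeCommand=merge ...
    """
    gatk_tools = []
    bcftools_commands = []
    samples = []
    for line in header_lines:
        if line.startswith("##GATKCommandLine"):
            match = re.search(r"ID=([^,>]+)", line)
            if match:
                gatk_tools.append(match.group(1))
        elif line.startswith("##bcftools_") and "Command=" in line:
            # Keep the key only, e.g. "bcftools_mergeCommand"
            bcftools_commands.append(line[2:].split("=", 1)[0])
        elif line.startswith("#CHROM"):
            # Columns 1-9 are fixed (CHROM..FORMAT); the rest are sample names
            samples = line.split("\t")[9:]
    return {
        "gatk_tools": gatk_tools,
        "bcftools_commands": bcftools_commands,
        "n_samples": len(samples),
        "samples": samples,
    }
```

Calling `summarize_header(read_header("cohort.vcf.gz"))` (the file name is a placeholder) returns a dictionary. Seeing `GenotypeGVCFs` in `gatk_tools` together with more than one sample points towards GATK joint genotyping, while a `bcftools_mergeCommand` entry suggests per-sample calls that were merged afterwards.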
I'd suggest that if you see variants that fail filters in some samples, due to very low (but nonzero) allele frequency, the data was likely jointly called. Unless it's a gVCF.