I need to analyze both variant and non-variant sites from a VCF. I produced it with GATK HaplotypeCaller (v3.7), but I'm struggling to work out how to filter the variant and non-variant sites 'fairly'. Illumina states you can do this using the GQX value, but my GATK gVCFs and VCFs don't have that annotation. Does anyone know whether only the Isaac variant caller produces GQX values? And is it possible for me to calculate it post hoc?
Here is the description of GQX from Illumina:
Genotype Quality for Variant and Non-variant Sites
The gVCF file uses an adapted version of genotype quality for variant and non-variant site filtration. This value is associated with the key GQX. The GQX value is intended to represent the minimum of {Phred genotype quality assuming the site is variant, Phred genotype quality assuming the site is non-variant}. The reason for using this value is to allow a single value to be used as the primary quality filter for both variant and non-variant sites. Filtering on this value corresponds to a conservative assumption appropriate for applications where reference genotype calls must be determined at the same stringency as variant genotypes, i.e.:
• An assertion that a site is homozygous reference at GQX >= 30 is made assuming the site is variant.
• An assertion that a site is a non-reference genotype at GQX >= 30 is made assuming the site is non-variant.
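Going by that definition, one rough post hoc proxy might be to take the minimum of the site QUAL and the sample GQ wherever both are present, and to fall back to GQ alone for reference blocks, where QUAL is missing. The sketch below is only that: it assumes QUAL can stand in for the "assuming the site is non-variant" quality and GQ for the other, which is an approximation and not Illumina's actual computation. It also assumes a single-sample, tab-delimited gVCF line, and `gqx_proxy` and `parse_gq` are hypothetical helper names.

```python
from typing import Optional


def parse_gq(format_field: str, sample_field: str) -> Optional[float]:
    """Pull GQ out of the FORMAT/sample columns; None if absent or missing."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    if "GQ" not in keys:
        return None
    idx = keys.index("GQ")
    if idx >= len(values) or values[idx] in (".", ""):
        return None
    return float(values[idx])


def gqx_proxy(vcf_line: str) -> Optional[float]:
    """Approximate GQX for one single-sample VCF/gVCF record.

    ASSUMPTION: min(QUAL, GQ) is used as a stand-in for Illumina's GQX.
    Reference blocks usually have QUAL '.', so only GQ is used there.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    qual = None if fields[5] == "." else float(fields[5])
    gq = parse_gq(fields[8], fields[9])
    candidates = [v for v in (qual, gq) if v is not None]
    return min(candidates) if candidates else None


# Made-up example records, for illustration only.
variant = "chr1\t100\t.\tA\tG\t45.2\t.\tDP=20\tGT:AD:DP:GQ\t0/1:10,10:20:99"
ref_block = "chr1\t101\t.\tT\t<NON_REF>\t.\t.\tEND=150\tGT:DP:GQ:MIN_DP\t0/0:18:48:15"

print(gqx_proxy(variant))
print(gqx_proxy(ref_block))
```

Filtering both record types on `gqx_proxy(line) >= 30` would then apply a single threshold to variant and non-variant sites, though whether that is comparable in stringency to Illumina's GQX is not something this sketch can guarantee.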
Hi Clare, I have a similar question. Were you able to find a solution?