Hi. I always appreciate all your help.
I have one VCF file (a.vcf) that contains a single variant. Some of its genotypes are missing ("./.") because DP=0. The variant is tri-allelic, as shown below.
"a.vcf"
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample131 sample138 sample908
chr12 104350956 . G T,A 147880 PASS . GT:AD:DP:GQ ./.:0,0:0 0/1:25,22,0:47:99 0/0:36,0,0:36:99 ./.:0,0:0
I want to split the tri-allelic record into bi-allelic records, so I ran the following GATK command.
java -jar GenomeAnalysisTK.jar \
-T LeftAlignAndTrimVariants \
-R ${ref_path} \
--variant a.vcf \
-o b.vcf \
--splitMultiallelics \
--reference_window_stop 900
As a result, I got b.vcf.
"b.vcf"
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample131 sample138 sample908
chr12 104350956 . G T 147880 PASS AC=1;AF=0.250;AN=4 GT:AD:DP:GQ 0/0:.:0 0/1:25,22:47:99 0/0:36,0:36:99 0/0:.:0
chr12 104350956 . G A 147880 PASS AC=0;AF=0.00;AN=4 GT:AD:DP:GQ 0/0:.:0 0/0:25,0:47:99 0/0:36,0:36:99 0/0:.:0
In b.vcf, the two split records are bi-allelic, but the missing genotypes have been set to "0/0". I want the missing genotypes to stay "./." after GATK processing.
How should I process the file?
The GATK version is 3.6.
You should ask on the GATK forum: http://gatkforums.broadinstitute.org/gatk
Thank you for your advice.
I asked the GATK team to submit a bug report.
http://gatkforums.broadinstitute.org/gatk/discussion/8415/why-does-gatk-leftalignandtrimvariants-set-a-missing-genotype-to-0-0?