8.0 years ago
Lin
Is there a tool that will incorporate the variants into a reference genome based on the genotype information (GT info) and the allele depth (AD info)?
That is, at each locus with a variant, the tool would look at the genotype. If it is heterozygous, it would take the allele with the highest allelic depth and incorporate it into the reference genome. If it is homozygous, it would take the allele indicated in the GT field and incorporate it into the reference genome.
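To make the selection rule concrete, here is a minimal Python sketch of it. The function name `pick_allele` is made up for illustration and is not part of any existing tool. It assumes a called genotype (no `.` entries) and an AD field with one depth per allele in REF, ALT1, ALT2 order. Breaking ties in favour of the allele listed first in GT is also an assumption, since the rule above does not say what to do with equal depths.

```python
import re


def pick_allele(gt: str, ad: str) -> int:
    """Return the index of the allele to incorporate (0 = REF, 1 = first ALT, ...).

    Hypothetical helper: assumes a called genotype and one AD value per allele.
    """
    # GT alleles are separated by "/" (unphased) or "|" (phased).
    alleles = [int(a) for a in re.split(r"[/|]", gt)]
    depths = [int(d) for d in ad.split(",")]

    if len(set(alleles)) == 1:
        # Homozygous: use the allele named in GT.
        return alleles[0]

    # Heterozygous: among the alleles named in GT, take the one with the
    # highest depth. max() keeps the first maximum, so a tie goes to the
    # allele listed first in GT (an assumption, not part of the stated rule).
    return max(alleles, key=lambda a: depths[a])
```

Note that for a 0/1 genotype where REF has the higher depth, this returns 0, meaning the reference base would be left unchanged.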
Example heterozygous:
reference sequence: AGG
vcf:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S6
20 2 . G GAG,GAA 626.73 PASS AC=1,1;AF=0.500,0.500;AN=2;DP=19;ExcessHet=3.0103;FS=0.000;MLEAC=1,1;MLEAF=0.500,0.500;MQ=59.85;NEGATIVE_TRAIN_SITE;POSITIVE_TRAIN_SITE;QD=29.87;SOR=4.977;VQSLOD=1.17;culprit=SOR GT:AD:DP:GQ:PL 1/2:0,4,70:11:99:664,307,281,182,0,147
The new sequence will be: AGAAG
Example homozygous:
reference sequence: AGG
vcf:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S6
20 2 . G GAG,GAA 626.73 PASS AC=1,1;AF=0.500,0.500;AN=2;DP=19;ExcessHet=3.0103;FS=0.000;MLEAC=1,1;MLEAF=0.500,0.500;MQ=59.85;NEGATIVE_TRAIN_SITE;POSITIVE_TRAIN_SITE;QD=29.87;SOR=4.977;VQSLOD=1.17;culprit=SOR GT:AD:DP:GQ:PL 1/1:0,7,0:11:99:664,307,281,182,0,147
The new sequence will be: AGAGG
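The two expected sequences above can be checked with a short Python sketch of the substitution step. `apply_variants` is a hypothetical helper, not an existing tool. It assumes 1-based VCF positions, non-overlapping variants, and that the allele to incorporate has already been chosen for each site.

```python
def apply_variants(ref_seq: str, variants: list[tuple[int, str, str]]) -> str:
    """Apply (pos, ref, allele) substitutions to ref_seq.

    Hypothetical helper: pos is 1-based as in VCF, variants must not overlap.
    """
    seq = ref_seq
    # Work from the rightmost variant backwards so that insertions and
    # deletions do not shift the coordinates of variants still to be applied.
    for pos, ref, allele in sorted(variants, key=lambda v: v[0], reverse=True):
        start = pos - 1
        if seq[start:start + len(ref)] != ref:
            raise ValueError(f"REF {ref!r} does not match the sequence at position {pos}")
        seq = seq[:start] + allele + seq[start + len(ref):]
    return seq


# Heterozygous example: GAA is the chosen allele at position 2.
print(apply_variants("AGG", [(2, "G", "GAA")]))
# Homozygous example: GAG is the chosen allele at position 2.
print(apply_variants("AGG", [(2, "G", "GAG")]))
```

The REF check is there only to catch a VCF that does not match the FASTA it is applied to.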
GATK FastaAlternateReferenceMaker selects the allele randomly; it has no option to select the allele based on allelic depth.
Also, as of July 2019, FastaAlternateReferenceMaker skips heterozygous ALT allelic sites.