4.0 years ago
igorm
Hi,
I have a multiallelic record in a VCF file:
1 15274 rs62636497 A G,T . PASS DR2=0.08,0.08;AF=0.454,0.5454;IMP GT:DS:AP1:AP2:GP 1|2:0.91,1.09:0.6,0.4:0.31,0.69:0,0,0.19,0,0.54,0.28
From the record I can see that the genotype is G|T.
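For reference, this is how I read the genotype off the record: a minimal sketch, where `decode_gt` is a hypothetical helper of my own (not part of bcftools or any library). It assumes GT indices follow the usual VCF convention, with 0 for REF and 1, 2, ... for the ALT alleles in order.

```python
def decode_gt(ref, alt, gt):
    """Translate a GT string such as '1|2' into allele letters.

    ref: REF column, alt: ALT column (comma-separated), gt: GT value.
    Hypothetical helper for illustration only.
    """
    alleles = [ref] + alt.split(",")      # index 0 = REF, 1.. = ALT alleles
    sep = "|" if "|" in gt else "/"       # keep the phased/unphased separator
    return sep.join(
        "." if code == "." else alleles[int(code)]
        for code in gt.split(sep)
    )


print(decode_gt("A", "G,T", "1|2"))
```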
Then I split the multiallelic record into biallelic records with bcftools norm and get:
1 15274 rs62636497 A G . PASS DR2=0.08,0.08;AF=0.454;IMP GT:DS:AP1:AP2:GP 1|0:0.91:0.6:0.31:0,0,0.19
1 15274 rs62636497 A T . PASS DR2=0.08,0.08;AF=0.5454;IMP GT:DS:AP1:AP2:GP 0|1:1.09:0.4:0.69:0,0,0.28
To recover the G|T genotype, I need to read the non-reference allele from each of the two biallelic records and combine them. Is this the rule I should apply? Or is the genotype information lost in such cases? (I noticed that gatk LeftAlignAndTrimVariants gives me ./. for both records for this SNP when I use --split-multi-allelics on the multiallelic record.)
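To make the rule concrete, here is a minimal sketch of what I have in mind. `merge_split_gt` is a hypothetical function of my own, and it assumes the GT values are phased and non-missing, so that the position within `a|b` identifies the haplotype. I do not know whether this is the intended way to recombine split records, and it would not apply to unphased or `./.` calls.

```python
def merge_split_gt(ref, records):
    """Rebuild a phased genotype from split biallelic records at one site.

    ref: REF allele shared by all records.
    records: list of (alt, gt) pairs, e.g. [("G", "1|0"), ("T", "0|1")].
    Assumption: every gt is phased ('|') and contains only 0 or 1.
    """
    n_hap = len(records[0][1].split("|"))
    haplotypes = [ref] * n_hap            # start from REF on every haplotype
    for alt, gt in records:
        codes = gt.split("|")
        if len(codes) != n_hap or any(c not in ("0", "1") for c in codes):
            raise ValueError(f"unsupported GT for this sketch: {gt!r}")
        for i, code in enumerate(codes):
            if code == "1":
                if haplotypes[i] != ref:  # two ALT alleles on one haplotype
                    raise ValueError("conflicting ALT alleles on haplotype %d" % i)
                haplotypes[i] = alt
    return "|".join(haplotypes)


print(merge_split_gt("A", [("G", "1|0"), ("T", "0|1")]))
```

With the two split records above, this prints G|T, which matches the genotype in the original multiallelic record.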
Thanks!