7.4 years ago
lordoftheowl
Hello,
I am working with a VCF file with the format
##FORMAT=<ID=GT:,Number=1,Type=String,Description="Best Guessed Genotype with posterior probability threshold of 0.9">
##FORMAT=<ID=GL,Number=3,Type=Float,Description="Posterior probability of 0/0, 0/1, and 1/1">
##FORMAT=<ID=DS,Number=1,Type=Float,Description="Dosage">
Leading to data which looks like
./.:0.009,0.758,0.233:1.224
That is, the GT field only contains a genotype if one of the probabilities surpasses the 0.9 threshold; otherwise it is left as missing.
I would like to remove the threshold. Given the GL values for each site, I would like to write the most likely genotype (i.e. the one with the largest GL value) into GT. Can someone suggest a way to go about this?
Misha
Hi, I have the exact same problem/question, with the exception that my file has ".|." instead of "./." in the GT field. Misha, have you found a solution?
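One possible approach, as a minimal sketch rather than a tested tool: take the index of the largest GL value in each sample field and write the matching genotype into GT. The function names `best_guess_gt` and `rewrite_line` are made up for this illustration. The sketch assumes biallelic sites, that GL holds the three posterior probabilities in the order 0/0, 0/1, 1/1 as the header above states, and that ties go to the first maximum. It reuses whichever separator ("/" or "|") the existing GT field has; note that for a ".|." file a heterozygote is then written as "0|1", which asserts a phase that the GL values alone cannot support.

```python
GENOTYPES = [("0", "0"), ("0", "1"), ("1", "1")]  # same order as the GL values

def best_guess_gt(sample_field, format_keys="GT:GL:DS"):
    """Return the sample field with GT set to the genotype with the largest GL."""
    keys = format_keys.split(":")
    values = sample_field.split(":")
    if "GT" not in keys or "GL" not in keys or len(values) != len(keys):
        return sample_field  # nothing to work from, leave untouched
    gt_idx, gl_idx = keys.index("GT"), keys.index("GL")
    try:
        probs = [float(p) for p in values[gl_idx].split(",")]
    except ValueError:
        return sample_field  # missing or malformed GL (e.g. ".")
    if len(probs) != len(GENOTYPES):
        return sample_field  # not a biallelic site, outside this sketch
    sep = "|" if "|" in values[gt_idx] else "/"  # keep the file's separator
    best = max(range(len(probs)), key=probs.__getitem__)  # first max on ties
    values[gt_idx] = sep.join(GENOTYPES[best])
    return ":".join(values)

def rewrite_line(line):
    """Apply best_guess_gt to every sample column of one VCF line."""
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    cols = line.rstrip("\n").split("\t")
    # Column 9 (index 8) is FORMAT; sample columns start at index 9.
    cols[9:] = [best_guess_gt(s, cols[8]) for s in cols[9:]]
    return "\t".join(cols) + "\n"
```

For the example field above, `best_guess_gt("./.:0.009,0.758,0.233:1.224")` gives `0/1:0.009,0.758,0.233:1.224`. To convert a whole file, `rewrite_line` can be applied to each line read from the input VCF and the result written to a new file. The ##FORMAT description for GT would also need updating afterwards, since it would no longer reflect a 0.9 threshold.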