Dear all,
The pipeline I am working on will call sites in sequenced regions with bcftools. We have sequenced replicates of species we know to have the same genotype. The idea is to combine these replicates into one genotype for each species.
To do this I have written a Python script that uses PyVCF to walk through all the records and combine these replicates. I ran into a snag when combining the Phred-scaled genotype likelihoods (PL).
Right now I'm combining the score for each genotype by taking the max over the replicates. Preferably the genotype likelihood would be recalculated.
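For reference, the current combining step boils down to roughly the sketch below. It is simplified to plain lists instead of PyVCF records; the function name `combine_pl_max` and the example values are made up for illustration and are not taken from my actual script.

```python
def combine_pl_max(replicate_pls):
    """Combine PL vectors by taking the per-genotype max over replicates.

    replicate_pls: one PL list per replicate (e.g. [0, 30, 200] for
    genotypes 0/0, 0/1, 1/1), or None for a replicate without a call.
    Returns the combined PL list, or None if no replicate has a PL.
    """
    # Skip replicates with a missing PL field (illustrative handling).
    called = [pl for pl in replicate_pls if pl is not None]
    if not called:
        return None

    # All replicates must describe the same set of genotypes.
    n_genotypes = len(called[0])
    if any(len(pl) != n_genotypes for pl in called):
        raise ValueError("PL vectors differ in length")

    # zip(*called) groups the values for each genotype across replicates.
    return [max(values) for values in zip(*called)]


# Hypothetical PLs for three replicates of one species at one site.
replicates = [[0, 30, 200], [10, 0, 150], None]
print(combine_pl_max(replicates))  # → [10, 30, 200]
```

Note that the combined vector is not rescaled afterwards, so its smallest value is not necessarily 0, which is part of why I would rather recalculate it.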
Would recalculating the PL be possible? Is there documentation on how the genotype likelihood is calculated?
I can only find this on how to calculate PL: https://software.broadinstitute.org/gatk/documentation/article?id=5913
Otherwise, everything going well, Arlo? :P
Cheers, Koen