I am new to this Biostar site, so please accept my apology if the following question is considered "inappropriate" for this forum.
I am not sure whether the question is more genetics or bioinformatics (or neither). I did a lot of googling but still could not figure it out. Possible explanations: (a) I misunderstand the definition of AD (allelic depths for the ref and alt alleles, in the order listed); (b) I misunderstand the definition of GT (genotype, where 0 = ref and 1 = first alt allele); (c) something else.
I have trio exome sequencing data. One of the SNPs is listed below:
REF ALT QUAL FORMAT Child Father Mother
A G 583.14 GT:AD:DP:GQ:PL 1/1:208,34:244:14.96:118,15,0 0/1:186,51:241:8.72:80,0,9 0/0:226,1:236:3.01:0,3,30
The child's GT is 1/1, which means the child's alleles are G/G. However, the child's AD is 208,34, which by the "definition" means to me that the child has a higher ref allele (A) count than alt allele (G) count. So does AD 208,34 mean the child's GT should be A/G? In contrast, the father's GT is 0/1, so he is A/G, which is consistent with his AD of 186,51. The mother's GT is 0/0, so she is A/A, and her AD is 226,1. To me, her AD and GT are consistent and both indicate A/A (the single G count might be an artifact).
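To make the mismatch concrete, here is a minimal Python sketch that splits each sample column by the FORMAT keys and computes the fraction of reads supporting the alt allele from AD. The helper name `alt_fraction` and the `SAMPLES` dictionary are just my own illustration, not part of any tool; the values are copied from the record above.

```python
# Sample columns copied from the record above (FORMAT = GT:AD:DP:GQ:PL).
FORMAT = "GT:AD:DP:GQ:PL"
SAMPLES = {
    "Child": "1/1:208,34:244:14.96:118,15,0",
    "Father": "0/1:186,51:241:8.72:80,0,9",
    "Mother": "0/0:226,1:236:3.01:0,3,30",
}


def alt_fraction(sample_field, fmt=FORMAT):
    """Return (GT, alt reads / (ref reads + alt reads)) for one sample column.

    Hypothetical helper for illustration; assumes a biallelic site,
    so AD holds exactly two counts (ref, alt).
    """
    fields = dict(zip(fmt.split(":"), sample_field.split(":")))
    ref_count, alt_count = (int(x) for x in fields["AD"].split(","))
    return fields["GT"], alt_count / (ref_count + alt_count)


for name, field in SAMPLES.items():
    gt, frac = alt_fraction(field)
    print(f"{name}: GT={gt} alt fraction={frac:.3f}")
```

For the child this gives an alt fraction of 34/242, roughly 0.14, which is what makes the 1/1 call look odd to me.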
Can anyone please help clarify this obvious discrepancy between the child's GT and the parents' GT? Any insight will be greatly appreciated (I hope the question is relevant enough to this forum that it will not be closed).
Great many thanks
I've also seen something similar in locations that match a region of a pseudogene.
Agreed, the GQ and PL scores give me no confidence that those genotypes will stand up to closer scrutiny. You might want to revisit your filtering processes for screening variants.
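As a rough sketch of that kind of screening, the Python below flags samples whose GQ falls under a cutoff. The cutoff of 20 is only an assumption for illustration (pick whatever suits your data), and `parse_sample` and `low_confidence` are hypothetical helper names, not functions from any variant caller.

```python
GQ_THRESHOLD = 20.0  # assumption: illustrative cutoff, not taken from this thread


def parse_sample(sample_field, fmt="GT:AD:DP:GQ:PL"):
    """Split one VCF sample column into its GT, GQ and PL values."""
    fields = dict(zip(fmt.split(":"), sample_field.split(":")))
    return {
        "GT": fields["GT"],
        "GQ": float(fields["GQ"]),
        "PL": [int(x) for x in fields["PL"].split(",")],
    }


def low_confidence(sample_field, threshold=GQ_THRESHOLD):
    """True when the genotype quality is below the chosen cutoff."""
    return parse_sample(sample_field)["GQ"] < threshold


# Sample columns copied from the SNP in the question.
trio = {
    "Child": "1/1:208,34:244:14.96:118,15,0",
    "Father": "0/1:186,51:241:8.72:80,0,9",
    "Mother": "0/0:226,1:236:3.01:0,3,30",
}

flagged = [name for name, field in trio.items() if low_confidence(field)]
print(flagged)  # ['Child', 'Father', 'Mother']
```

With this cutoff all three genotypes at the site get flagged, since the GQ values are 14.96, 8.72 and 3.01.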
Great many thanks. Yeah, I will need to "clean up" the variant calls a bit and remove those that are not called with high confidence based on GQ and PL.