6.2 years ago
genie66
This might be a very basic question for many here. As I understand inheritance, even though a population can carry many alleles at a locus, an individual's genotype can only have two alleles (one paternal, one maternal), so after variant calling a position is either homozygous or heterozygous. So there can be at most two alleles, but why do we see multiple alleles at a given position in a single-sample VCF? I am trying to understand the science behind this. Please help out. Thanks!
E.g.:
chr5 127640782 . AG A,AA . . . GT:AD:DP 1/2:0,28,409:437
For one sample, yes. If there is more than one sample in the VCF for the same variant: sample 1 is A/T, sample 2 is T/G ...
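To make the multi-sample case concrete, here is a minimal sketch of how two diploid samples can force more than one ALT allele at a site. The reference base "A" and the helper name encode_site are assumptions for illustration only; real callers do much more than this.

```python
def encode_site(ref, sample_genotypes):
    """Collect ALT alleles across samples and encode each genotype as VCF-style indices.

    ref is the (assumed) reference allele; sample_genotypes is a list of
    two-allele tuples, one tuple per diploid sample.
    """
    alts = []
    for genotype in sample_genotypes:
        for allele in genotype:
            # Any allele that differs from the reference becomes an ALT entry.
            if allele != ref and allele not in alts:
                alts.append(allele)

    alleles = [ref] + alts  # index 0 = REF, 1.. = ALT in order of appearance
    gts = [
        "/".join(str(i) for i in sorted(alleles.index(a) for a in genotype))
        for genotype in sample_genotypes
    ]
    return alts, gts


# Sample 1 is A/T, sample 2 is T/G, reference assumed to be A.
alts, gts = encode_site("A", [("A", "T"), ("T", "G")])
print("ALT:", ",".join(alts))
print("GT per sample:", gts)
```

Each sample still has only two alleles, but the site as a whole needs two ALT entries to describe both samples.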
To add to what Pierre said, even with a single sample, multi-allelic calls can appear because NGS data has high error rates. It is difficult for in silico tools to handle such messy data faithfully.
Indels should dominate the multi-allelic regions, too, as indels are inherently difficult for a variant caller to call.
Please don't ask the same question on multiple boards at the same time. If you already have a good answer on one board, it's a waste of people's time to repeat it here.
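Sometimes both the maternal and the paternal allele are alternate alleles, because the sample does not carry the reference allele on either chromosome. In the example above, GT 1/2 means exactly that: the two called alleles are A and AA, and the reference AG is not part of the genotype, so the sample still has only two alleles.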
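A small sketch that decodes the record from the question shows this. It uses plain string splitting rather than a VCF library, the helper name decode_genotype is made up for illustration, and it assumes a called (non-missing) GT and a single sample column.

```python
record = "chr5 127640782 . AG A,AA . . . GT:AD:DP 1/2:0,28,409:437"


def decode_genotype(line):
    """Return the called alleles and per-allele read depths for a one-sample VCF line."""
    fields = line.split()  # real VCF is tab-separated; split() handles both
    ref, alt = fields[3], fields[4]
    alleles = [ref] + alt.split(",")  # index 0 = REF, 1.. = ALT

    # Pair FORMAT keys (GT, AD, DP) with the sample's values.
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))

    # GT holds allele indices; "/" is unphased, "|" is phased.
    gt_indices = [int(i) for i in sample["GT"].replace("|", "/").split("/")]
    called = [alleles[i] for i in gt_indices]

    # AD lists read depth per allele, in the same order as REF followed by ALT.
    depths = [int(d) for d in sample["AD"].split(",")]
    depth_by_allele = dict(zip(alleles, depths))
    return called, depth_by_allele


called, depth_by_allele = decode_genotype(record)
print("Called alleles:", called)
print("Reads per allele:", depth_by_allele)
```

The genotype has two alleles, A and AA, and both are ALT. The REF allele AG has zero supporting reads, so three alleles are listed at the site but only two are in the sample's genotype.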