Hi all, I have been doing variant calling on human cancer exomes, and some variants in the VCF file have genotypes like 4/6 or 2/3. Is there any explanation for this? I expected to see only 0/1 or 1/1. Thank you very much
> is there any explanation for this fact

Yes, read the VCF spec: https://samtools.github.io/hts-specs/VCFv4.2.pdf
> ALT - alternate base(s): Comma separated list of alternate non-reference alleles. The...
> (...)
> GT : genotype, encoded as allele values separated by either of / or | . The allele values are 0 for the reference allele (what is in the REF field), 1 for the first allele listed in ALT, 2 for the second allele list in ALT and so on.
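To make that indexing concrete, here is a minimal Python sketch of how a GT value maps back to allele sequences. It is only an illustration, not part of the spec or of any library: the function name `decode_genotype` and the example REF/ALT values are made up, and it assumes you have already pulled the REF, ALT and GT strings out of a VCF record.

```python
def decode_genotype(ref, alt, gt):
    """Map a VCF GT string such as '2/3' to the allele sequences it names.

    Hypothetical helper for illustration: `ref`, `alt` and `gt` are the raw
    REF, ALT and GT strings taken from one VCF record.
    """
    # Index 0 is the REF allele, indices 1..n are the comma-separated ALT alleles.
    alleles = [ref] + alt.split(",")
    # '|' means phased, '/' means unphased; the indexing is the same for both.
    phased = "|" in gt
    indices = gt.replace("|", "/").split("/")
    # '.' marks a missing allele call, which has no sequence.
    decoded = [None if i == "." else alleles[int(i)] for i in indices]
    return decoded, phased


# Made-up record: REF=T, ALT=TC,TCC,TCCC, GT=2/3
print(decode_genotype("T", "TC,TCC,TCCC", "2/3"))
# (['TCC', 'TCCC'], False)
```

So a genotype of 2/3 just says the sample carries the second and third alleles listed in ALT, and neither copy matches the reference.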
Genotypes work like this:
Where n has to be strictly less than the number of letters that compose your base space. In the case of DNA you have four (A, C, T, G), so you can expect at most 4-1=3 alternative alleles.
My suspicion is that you're calling variants in a very homebrew way and something is messing up your genotyping. If not: can you post your variant calling pipeline?
The alternative alleles can contain multiple letters, so the assertion that their number is less than the size of the "base space" (e.g. ACGT) is false. You can see things like
alternative_alleles=TC,TCC,TCCC
That is, each allele can have multiple letters.
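A quick way to see the consequence: the only limit on a GT index is the number of entries in the ALT column, not the size of the nucleotide alphabet. The sketch below is an illustration under that reading of the spec; `gt_indices_valid` is a made-up helper name, and the six-allele ALT string is invented for the example.

```python
def gt_indices_valid(alt, gt):
    """Check that every allele index in GT refers to REF (0) or a listed ALT allele.

    Hypothetical helper for illustration; `alt` and `gt` are the raw ALT and
    GT strings from one VCF record.
    """
    # A lone '.' in ALT means no alternate alleles are listed.
    n_alt = 0 if alt == "." else len(alt.split(","))
    # Ignore missing calls ('.'); treat phased and unphased separators alike.
    indices = [int(i) for i in gt.replace("|", "/").split("/") if i != "."]
    return all(0 <= i <= n_alt for i in indices)


# Three ALT alleles: 2/3 is fine, 4/6 is not.
print(gt_indices_valid("TC,TCC,TCCC", "2/3"))   # True
print(gt_indices_valid("TC,TCC,TCCC", "4/6"))   # False
# A made-up site with six ALT alleles makes 4/6 a valid genotype.
print(gt_indices_valid("TC,TCC,TCCC,TCCCC,TCCCCC,TCCCCCC", "4/6"))  # True
```

In other words, a genotype like 4/6 is only consistent if the ALT column of that record lists at least six alleles.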
That is true, if you allow multiple nucleotide polymorphisms. My bad, I was imprecise :)
Thank you for the answers. What I do not understand is how calling in a diploid genome can yield variants with more than 3 alternative alleles (reference, copy 1 and copy 2) if only 1 sample is being used. Is that possible?
@cmdcolin literally answered this question of yours.