When we have a variant with AD=53,50, for example, or AD=800,80, does this just mean the variant is heterozygous, or does it mean we have a low-frequency variant, since the reads that support the ALT do not outnumber those that support the REF?
How can I relate AD to heterozygosity?
Hi Sharon, the AD (Allelic Depth) relates to the 'high quality' reads supporting the calls listed in the REF and ALT field of the VCF record in question, respectively. Your variant therefore has 80 reads supporting it, with the reference base having 800.
If your sample was germline, then most variant callers would not call this as heterozygous because the allelic fraction of the variant is just 9.09% ((80 / (800+80)) * 100). In order to call it, you would have to drastically lower the thresholds; however, in doing so, you will introduce many false positive calls elsewhere.
On the other hand, numbers with an imbalance of this level are typically seen in cancer samples, where a particular tumour clone in which the variant is being called may only comprise 10% of the cells that were sequenced from the original tumour biopsy (thus the frequency of the variant comes back at 9 or 10%).
You should look at other metrics in the VCF in combination with AD and AF in order to decide whether this call is actually genuine or not. Also, look at the read alignments over the region.
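To make the allelic-fraction arithmetic concrete, here is a minimal Python sketch. It assumes a biallelic site where AD is written as "REF,ALT"; the function name `alt_allele_fraction` is just an illustrative helper, not part of any variant caller or library.

```python
def alt_allele_fraction(ad):
    """Return ALT reads / (REF + ALT reads) for an AD string such as '800,80'.

    Assumption: biallelic site, so only the first two AD values are used.
    """
    ref_depth, alt_depth = (int(x) for x in ad.split(",")[:2])
    total = ref_depth + alt_depth
    if total == 0:
        raise ValueError("AD reports no reads for REF or ALT")
    return alt_depth / total


for ad in ("800,80", "53,50", "49,49"):
    # Print each AD value next to its ALT fraction as a percentage
    print(ad, f"{alt_allele_fraction(ad):.1%}")
# 800,80 9.1%
# 53,50 48.5%
# 49,49 50.0%
```

So 800,80 sits at roughly 9%, while 53,50 and 49,49 sit at roughly half of the reads.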
Thanks so much, Kevin. That's very helpful. My other concern is whether we should also consider variants with AD=49,49 or AD=53,50.
How can we be sure there is a variant at this position when half the reads agree with the reference and half disagree? I feel I am missing something: do 49,49 and 53,50 just mean a significant variant that is heterozygous? Or does it mean we should discard the variant too because of the uncertainty (half/half)?
A 50% split between the reads indicates a heterozygous variant, i.e., the variant is found on either the maternal or paternal chromosome, but not both (humans being diploid organisms). Something closer to 100% in favour of the variant is evidence of a homozygous variant, i.e., one in which the variant is found on both the maternal and paternal chromosomes.
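As a rough illustration of that rule of thumb, the sketch below labels a REF/ALT read split. The cut-offs (`het_range`, `hom_min`) and the function name `rough_zygosity` are assumptions chosen only for illustration; they are not defaults from any variant caller, and real callers decide the genotype from genotype likelihoods rather than fixed cut-offs like these.

```python
def rough_zygosity(ref_depth, alt_depth, het_range=(0.3, 0.7), hom_min=0.9):
    """Label a REF/ALT read split using illustrative (hypothetical) cut-offs."""
    total = ref_depth + alt_depth
    if total == 0:
        return "no coverage"
    frac = alt_depth / total  # fraction of reads supporting ALT
    if frac >= hom_min:
        return "consistent with homozygous ALT"
    if het_range[0] <= frac <= het_range[1]:
        return "consistent with heterozygous"
    return "ambiguous, check other metrics and the alignments"


print(rough_zygosity(53, 50))   # consistent with heterozygous
print(rough_zygosity(49, 49))   # consistent with heterozygous
print(rough_zygosity(2, 98))    # consistent with homozygous ALT
print(rough_zygosity(800, 80))  # ambiguous, check other metrics and the alignments
```

In a germline sample, 53,50 and 49,49 both look like ordinary heterozygous calls, whereas 800,80 does not fit either expectation.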
Thanks Kevin, much appreciated, it's clear now :)