10.4 years ago
Adrian Pelin
Hello,
I have VCF files and would like to calculate LD r^2 values from allele frequencies, since my data are not phased.
VCFtools doesn't seem to work: it reports "-nan" values. Can anyone suggest other tools to do this?
Maybe my intuition is wrong, but I think one cannot compute LD from the allele frequencies alone.
Michael Dondrup and chrchang523 are correct. Allele frequencies alone are NOT sufficient to calculate LD. If a program claims to use only allele counts, it is running a phasing algorithm internally.
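To make this concrete, here is a minimal Python sketch using the standard definitions D = p_AB - p_A * p_B and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)). The function name and the two example populations are made up for illustration. Both populations have identical allele frequencies at the two sites, yet their r^2 values sit at opposite extremes, because the haplotype frequency p_AB differs.

```python
import math

def r_squared(p_ab, p_a, p_b):
    """r^2 for two bi-allelic sites.

    p_ab: frequency of the haplotype carrying allele A at site 1 and B at site 2
    p_a, p_b: frequencies of allele A at site 1 and allele B at site 2
    (hypothetical helper, not taken from any particular tool)
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # r^2 is undefined (0/0) when either site is monomorphic
        return math.nan
    return d * d / denom

# Two hypothetical populations, both with p_a = p_b = 0.5
print(r_squared(0.50, 0.5, 0.5))  # A and B always travel together -> 1.0
print(r_squared(0.25, 0.5, 0.5))  # A and B are independent        -> 0.0
```

Note that the function returns NaN when a site is monomorphic, since the denominator is then zero. That is one way an undefined r^2 can arise, though I can't say whether it explains the "-nan" output in the original question.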
Found an interesting article here.
What confuses me is that, although the authors say the program calculates r^2 values, their approach suggests that a different parameter is being computed. They only calculate LD for pairs of bi-allelic SNPs that are close enough to both be present in one read, so their calculation is based on haplotype frequencies. On the other hand, they introduce an ML approach that integrates allele frequency information to better estimate r^2, so that makes sense if I understand it correctly.
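As I understand the read-based idea, a read that covers both SNPs gives a directly observed haplotype, so r^2 can be estimated from counts of allele pairs. The sketch below is a plain plug-in estimate only, not the article's ML method. The function name, input layout, and example reads are all assumptions for illustration.

```python
from collections import Counter
import math

def r2_from_read_pairs(pairs, allele1, allele2):
    """Plug-in r^2 estimate from reads that span both SNPs.

    pairs: list of (base_at_snp1, base_at_snp2) tuples, one per read
    allele1, allele2: the alleles whose co-occurrence is tracked
    (hypothetical helper; ignores sequencing error and sampling noise)
    """
    n = len(pairs)
    if n == 0:
        return math.nan
    counts = Counter(pairs)
    p_ab = counts[(allele1, allele2)] / n           # haplotype frequency
    p_a = sum(1 for a, _ in pairs if a == allele1) / n
    p_b = sum(1 for _, b in pairs if b == allele2) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return math.nan                             # a site looks monomorphic
    d = p_ab - p_a * p_b
    return d * d / denom

# Ten made-up reads covering both SNPs
reads = [("A", "C")] * 6 + [("G", "T")] * 3 + [("A", "T")] * 1
print(r2_from_read_pairs(reads, "A", "C"))
```

Because the haplotypes are read directly off single reads, no phasing is needed. The cost is that both SNPs must fit within one read.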
One disadvantage of this approach is that it only considers pairs of SNPs very close to one another, so it cannot calculate r^2 decay along a chromosome. I am still trying to grasp the basics of r^2 and the caveats of using it as opposed to the measure D.
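For reference, these are the standard two-locus definitions for bi-allelic sites. The symbols are my own choice: p_A and p_B are the allele frequencies at the two sites, and p_AB is the frequency of the haplotype carrying both alleles.

```latex
D = p_{AB} - p_A \, p_B
\qquad
r^2 = \frac{D^2}{p_A (1 - p_A) \, p_B (1 - p_B)}
```

The sign and the maximum size of D depend on the allele frequencies, which makes raw D values hard to compare between SNP pairs. r^2 rescales D to the range [0, 1], although it can only reach 1 when the two sites have matching allele frequencies.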