r2 correlation interpretation snp in plink pruning
10.4 years ago
Floris Brenk ★ 1.0k

Hi all,

PLINK has a "linkage disequilibrium based SNP pruning" function, --indep 50 5 2, where the 2 is the VIF threshold. Since VIF = 1/(1 - R^2), a threshold of 2 corresponds to R^2 = 0.5.
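For reference, the conversion between a VIF threshold and the corresponding R^2 can be sketched as below. The function names are hypothetical and only illustrate the formula above. One caveat from the PLINK documentation: the R^2 inside the VIF is the multiple correlation from regressing a SNP on all other SNPs in the window, while --indep-pairwise takes a pairwise r^2 threshold directly.

```python
def vif_to_r2(vif):
    """Return the R^2 implied by a variance inflation factor (VIF = 1 / (1 - R^2))."""
    if vif < 1:
        raise ValueError("VIF must be >= 1")
    return 1.0 - 1.0 / vif


def r2_to_vif(r2):
    """Return the VIF implied by an R^2 value in [0, 1)."""
    if not 0 <= r2 < 1:
        raise ValueError("R^2 must be in [0, 1)")
    return 1.0 / (1.0 - r2)


print(vif_to_r2(2))    # 0.5
print(r2_to_vif(0.5))  # 2.0
```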

Linkage disequilibrium is the non-random association of alleles. I'm struggling a bit with what, for example, an r2 of 0.5 means and how PLINK calculates it. Does 0.5 just mean a correlation of 0.5 between two SNPs? Can anyone explain a bit more what this 0.5 actually means in real numbers? For example, with 100 samples, how many SNPs need to be in perfect LD to reach an r2 of 0.5?

r2 plink SNP pruning • 8.9k views
10.4 years ago

It's the squared correlation coefficient between the 0/1/2 allele counts, i.e.

r^2 = (Cov(marker 1 allele counts, marker 2 allele counts))^2 / (Var(marker 1 allele counts) * Var(marker 2 allele counts)).
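A minimal sketch of that formula in plain Python, assuming complete genotype data coded as 0/1/2 allele counts. The function name `allele_count_r2` is made up for illustration, and missing genotypes are not handled, so this is not a reimplementation of PLINK itself.

```python
def allele_count_r2(x, y):
    """Squared Pearson correlation between two vectors of 0/1/2 allele counts."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("need two non-empty vectors of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (divide-by-n) covariance and variances; the choice of
    # n versus n - 1 cancels in the ratio below.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("r^2 is undefined for a monomorphic marker")
    return cov * cov / (var_x * var_y)


# Uncorrelated toy markers: the covariance is exactly zero.
print(allele_count_r2([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

Because the correlation is squared, it does not matter which allele is counted at each marker: swapping the counted allele at one marker flips the sign of the correlation but leaves r^2 unchanged.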

The "number of SNPs in perfect LD" required depends on the exact distribution of allele counts for each marker; for example, if both markers have empirical MAF 0.01, even 99 samples in "perfect LD" are not enough to guarantee r^2 >= 0.5.
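To make the low-MAF point concrete, here is a small constructed example (my own toy data, not taken from any real dataset): 100 samples, both markers with two minor alleles out of 200 (MAF 0.01), and 98 samples carrying identical genotypes at the two markers. The two remaining samples are enough to pull r^2 just below 0.5.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length numeric vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov * cov / (var_x * var_y)


n_samples = 100

# Marker 1: two heterozygous carriers (allele counts 1 and 1).
marker1 = [1, 1] + [0] * (n_samples - 2)
# Marker 2: one homozygous carrier (allele count 2), same total minor allele count.
marker2 = [2, 0] + [0] * (n_samples - 2)

maf1 = sum(marker1) / (2 * n_samples)
maf2 = sum(marker2) / (2 * n_samples)
concordant = sum(a == b for a, b in zip(marker1, marker2))
r2 = squared_correlation(marker1, marker2)

print(maf1, maf2)      # 0.01 0.01
print(concordant)      # 98
print(round(r2, 4))    # 0.4949
```

Working it through by hand gives Cov = 0.0196, Var(marker 1) = 0.0196 and Var(marker 2) = 0.0396, so r^2 = 0.0196 / 0.0396 = 49/99, or about 0.495.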
