Plink HWE QC question
5 months ago
janny.lau ▴ 10

Hello, I have been trying to do some QC on my SNP chip data using PLINK (I am running PLINK from RStudio), and I am very confused about the HWE QC.

One of the QC filters in PLINK is the Hardy-Weinberg Equilibrium test (--hwe). From my understanding, SNPs for which the chance that the deviation from HWE is due to random variation falls below a specified p-value will be excluded from further analysis, meaning that they deviate significantly from HWE. So would that mean that the smaller the p-value you use, the more stringent the HWE setting, and the fewer SNPs will be removed?

Therefore, small p-value = more SNPs retained and high p-value = fewer SNPs retained?

I honestly am very uncertain about this. If someone could please help me clear this up, I would appreciate it so very much.

HWE Plink p-value • 675 views
5 months ago
bk11 ★ 3.0k

You're close, but there's a little mix-up in the interpretation. Let me clarify how the Hardy-Weinberg Equilibrium (HWE) test works in PLINK:

HWE Test and p-Value: The HWE test checks whether the observed genotype frequencies deviate from the frequencies expected under Hardy-Weinberg Equilibrium (p², 2pq and q² for allele frequencies p and q). The p-value from this test is the probability of observing a deviation at least as extreme as the one observed, given that the SNP is in equilibrium.
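To make the test concrete, here is a minimal sketch that computes an HWE p-value from the three genotype counts of one SNP. Note the assumptions: PLINK's --hwe filter is based on an exact test, whereas this sketch uses the simpler chi-square approximation with 1 degree of freedom, so its p-values will not match PLINK's exactly. The function name `hwe_chisq_pvalue` is made up for illustration.

```python
import math

def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) HWE p-value from genotype counts AA, AB, BB.

    Illustrative approximation only; not the exact test PLINK uses.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Allele frequencies estimated from the genotype counts
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: nothing to test

    # Expected genotype counts under HWE: n*p^2, n*2pq, n*q^2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper tail of a chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))

# Counts that match HWE expectations exactly vs. a SNP with no heterozygotes
print(hwe_chisq_pvalue(25, 50, 25))
print(hwe_chisq_pvalue(50, 0, 50))
```

The first call gives a p-value of 1 because the counts match the expectation exactly. The second gives an extremely small p-value because a SNP with both alleles common but no heterozygotes is very unlikely under equilibrium.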

Interpreting p-Values:

Small p-Value: A small p-value (e.g., <0.0001) indicates that the SNP deviates significantly from HWE. This suggests either a technical problem with the SNP, such as genotyping error, or a real cause such as population stratification or selection. The smaller a SNP's p-value, the stronger the evidence of deviation, and the more likely the SNP is to fall below the filter threshold and be removed.

Large p-Value: A large p-value means that the SNP shows no significant deviation from HWE, so its genotype counts are consistent with equilibrium. SNPs with large p-values pass the filter and are retained.

Setting the Threshold: When you use the --hwe flag in PLINK, you specify a p-value threshold. SNPs with p-values below this threshold are excluded from further analysis. For example, --hwe 1e-6 excludes only SNPs with p-values below 1e-6, that is, those with very strong evidence of deviation. This is a lenient filter, so fewer SNPs are removed. Conversely, --hwe 0.05 excludes every SNP with a p-value below 0.05, which is a much stricter filter and removes more SNPs.

In summary:

Smaller p-value threshold: Less stringent filter, fewer SNPs removed (more retained).

Larger p-value threshold: More stringent filter, more SNPs removed (fewer retained).

So your conclusion was right. The only mix-up is the wording: a smaller threshold is the less stringent setting, not the more stringent one.
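A small sketch of how the threshold acts as a filter may help here. It does not call PLINK; it only mimics the rule "remove SNPs whose HWE p-value is below the threshold" on a handful of made-up p-values. The SNP IDs, the p-values and the helper name `hwe_filter` are all hypothetical.

```python
def hwe_filter(pvalues, threshold):
    """Split SNP IDs into (kept, removed) using the rule p < threshold -> removed."""
    kept = [snp for snp, p in pvalues.items() if p >= threshold]
    removed = [snp for snp, p in pvalues.items() if p < threshold]
    return kept, removed

# Made-up HWE p-values for four SNPs (illustration only)
pvalues = {"rs_a": 0.62, "rs_b": 0.03, "rs_c": 2e-4, "rs_d": 5e-9}

for threshold in (1e-6, 1e-3, 0.05):
    kept, removed = hwe_filter(pvalues, threshold)
    print(f"threshold {threshold:g}: removed {len(removed)}, kept {len(kept)}")
```

With these values, a threshold of 1e-6 removes one SNP, 1e-3 removes two, and 0.05 removes three: raising the threshold removes more SNPs, lowering it keeps more.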

I hope this clears up the confusion!
