Can using principal components as covariates muddle the detection power of a GWAS?
10 days ago
AndrMod • 0

I'm using FarmCPU on a rather small sample size (119-162 individuals) and I get strikingly different results depending on whether I use the PCs as covariates (option nPC.FarmCPU = 3).

With the PCs, the algorithm fails to add any QTN to the model and proceeds to test each locus individually. The QQ-plots show a strong deviation of the p-values from the expected null distribution, and the genomic inflation factor is above 1. The result is >1000 significant SNPs with q-value < 0.05, most of which are in linkage disequilibrium even though they are far apart: the SNPs in each of these groups share the same genotype pattern and have been assigned the same effect size. I noticed that, if I perform a PCA (genotypes as rows, individuals as columns), each group clusters together in the "individual space".
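For reference, the genomic inflation factor mentioned above is usually computed as the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). A minimal sketch, assuming the p-values are two-sided and lie in (0, 1]; the function name `genomic_inflation` is hypothetical and not part of FarmCPU:

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Return lambda_GC for a list of two-sided p-values in (0, 1]."""
    nd = NormalDist()
    # A two-sided p-value p corresponds to |z| = Phi^-1(1 - p/2),
    # and the 1-df chi-square statistic is z squared.
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    # Median of a 1-df chi-square under the null (about 0.455).
    expected_median = nd.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median


# Evenly spread p-values mimic a well-behaved null distribution.
uniform_p = [(i + 0.5) / 1001 for i in range(1001)]
# Squaring them shifts mass towards small p-values, i.e. inflation.
inflated_p = [p * p for p in uniform_p]

lambda_null = genomic_inflation(uniform_p)
lambda_inflated = genomic_inflation(inflated_p)
```

A value close to 1 is what a well-calibrated scan should give; values clearly above 1 point to inflation across the whole genome rather than a handful of true signals.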

Without the PCs, the algorithm adds QTNs to the model for almost all the traits. QQ-plots and genomic inflation are well behaved, and there are about 3-10 significant SNPs with q-value < 0.05.

I have also tried running a mixed linear model, but in that case there is too little power to detect signals.

I'm quite puzzled. PCs should control for population structure and, if anything, reduce the false positives in an analysis. I am looking at the distribution of the phenotypes to see whether there is something to control or improve, but I suspect that their inclusion may remove part of the very signal I am trying to detect. Are there similar cases or instances in which using PCs has had this effect?
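One rough way to check that suspicion is to ask how much of each phenotype's variance the PCs already explain. A minimal sketch, assuming the PC scores come from a PCA on the same individuals (so the PCs are mutually uncorrelated and the squared correlations add up to the regression R²); the names `pearson` and `variance_explained_by_pcs` and the toy numbers are hypothetical:

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def variance_explained_by_pcs(phenotype, pcs):
    """Return the per-PC squared correlations and their sum.

    pcs is a list of PC score vectors, one list per PC. The sum equals
    the R^2 of regressing the phenotype on the PCs only when the PCs
    are mutually uncorrelated.
    """
    r2 = [pearson(phenotype, pc) ** 2 for pc in pcs]
    return r2, sum(r2)


# Toy data: two orthogonal, mean-centred PCs and a phenotype that is
# driven entirely by the first one.
pc1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
pc2 = [2.0, -1.0, -2.0, -1.0, 2.0]
phenotype = [3.0 * v + 1.0 for v in pc1]

per_pc_r2, total_r2 = variance_explained_by_pcs(phenotype, [pc1, pc2])
```

If the total is large for a trait, the trait is strongly confounded with structure, and adding the PCs as covariates could plausibly absorb real signal along with the stratification.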

q-value covariates genomic-inflation pca gwas • 284 views

If you are doing PCA to capture population stratification, you should not perform PCA only on your data, but project your samples onto a pre-computed PCA space.
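Projection here means centring each sample with the reference panel's allele means and taking dot products with the reference PC loadings, rather than re-estimating the axes from your own data. A minimal sketch, assuming the reference means and loadings are already available and that the SNP order and allele coding match the reference; the function name `project_onto_reference` and the toy values are hypothetical:

```python
def project_onto_reference(genotypes, ref_means, loadings):
    """Project samples onto pre-computed principal axes.

    genotypes: list of samples, each a list of allele dosages per SNP
    ref_means: per-SNP mean dosage estimated in the reference panel
    loadings:  list of PC loading vectors, each of length n_snps
    """
    scores = []
    for sample in genotypes:
        # Centre with the reference means, not the study-sample means.
        centred = [g - m for g, m in zip(sample, ref_means)]
        # One score per reference axis: dot product with its loadings.
        scores.append(
            [sum(c * w for c, w in zip(centred, axis)) for axis in loadings]
        )
    return scores


# Toy example: two SNPs, two reference axes.
ref_means = [1, 1]
loadings = [[1, 0], [0, 1]]
study_genotypes = [[2, 0], [1, 1]]

pc_scores = project_onto_reference(study_genotypes, ref_means, loadings)
```

The resulting scores can then be supplied as covariates in place of PCs estimated from the study samples alone.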

