Hi everyone!
As far as I understand it, when you do PCA the resulting eigenvalues should sum to the number of variables in the original dataset. I have a genetic dataset containing 3,700 individuals and 111,867 variables.
When I do the PCA in R, I can sum the eigenvalues and the total equals 111,867, no problem.
However, when I do the PCA in PLINK, the eigenvalues in the file plink.eigenval don't sum to anywhere near that number :/
I'm using PLINK/1.90beta with the --pca flag. I set it to return the maximum number of PCs (3,700), so I definitely have all the information.
Have I misunderstood how PCA is supposed to work? Or am I misunderstanding the output of PLINK?
Thanks in advance!!
Thank you, your answer helped me a lot. This is what I did:
Following your answer, I ran PLINK with:
Calculate the sum of the variances (the diagonal) of the relationship covariance matrix:
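A minimal sketch of this step in Python, not the exact code used above. It assumes the relationship matrix was written as a plain-text square matrix (for example, a `plink.rel` file from `--make-rel square`), with one whitespace-delimited row per individual. The function name `diagonal_sum` and the toy numbers are hypothetical. The idea is that the eigenvalues of a symmetric matrix sum to its trace, so the diagonal sum gives the total variance without needing every eigenvalue.

```python
def diagonal_sum(lines):
    """Return the trace of a square matrix given as whitespace-delimited text rows."""
    total = 0.0
    row = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        total += float(fields[row])  # diagonal entry of this row
        row += 1
    return total


# Toy 3x3 relationship matrix (made-up numbers, for illustration only).
toy_rel = [
    "1.02 0.01 -0.03",
    "0.01 0.97 0.00",
    "-0.03 0.00 1.05",
]
total_variance = diagonal_sum(toy_rel)
print(total_variance)

# With a real file (assumed name), an open file handle can be passed directly:
#     with open("plink.rel") as fh:
#         total_variance = diagonal_sum(fh)
```

Reading row by row keeps memory use low, which matters for a 3,700 x 3,700 matrix less than it would for larger cohorts, but costs nothing here.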
Calculate the percentage of variance explained (pve) by each PC and write it to a file:
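A minimal sketch of this step in Python, again not the exact code used above. It assumes the eigenvalues come one per line (as in `plink.eigenval`) and that `total_variance` is the sum of the relationship-matrix diagonal from the previous step. The function names, the output column names, and the toy numbers are hypothetical. The formula used is pve_k = 100 * eigenvalue_k / total_variance.

```python
import io


def read_eigenvalues(lines):
    """Parse one eigenvalue per line, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]


def percent_variance_explained(eigenvalues, total_variance):
    """Return each eigenvalue as a percentage of the total variance."""
    return [100.0 * ev / total_variance for ev in eigenvalues]


def write_pve(pve, handle):
    """Write a tab-delimited table with one row per principal component."""
    handle.write("PC\tpve\n")
    for k, value in enumerate(pve, start=1):
        handle.write(f"{k}\t{value:.4f}\n")


# Toy inputs (made-up numbers, for illustration only).
toy_eigenval = ["1.52", "0.91", "0.61"]
total_variance = 3.04  # assumed: sum of the relationship-matrix diagonal

pve = percent_variance_explained(read_eigenvalues(toy_eigenval), total_variance)
out = io.StringIO()  # stands in for a real output file
write_pve(pve, out)
print(out.getvalue())
```

Dividing by the diagonal sum rather than by the sum of the reported eigenvalues means the percentages stay correct even when only the top few PCs are requested.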
Although I'm still unsure about the exact formula, I seem to get sensible results for the percentage of variance explained by each PC.