I'm doing a genome-wide association study (GWAS) in R. I have SNP genotype data for 300 individuals, with a total of 177,000 SNPs.
Before diving into the GWAS, I want to adjust for population stratification using PCA. I need to identify the top principal components (PCs), say the top 10, and use them as covariates in the association analysis. Do you know of any R packages or other software that will let me do this seamlessly?
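To make the goal concrete, here is a rough sketch in base R of what I have in mind. It is only an illustration under several assumptions: the genotypes are simulated (a 0/1/2 matrix with far fewer SNPs than my real data), there are no missing genotypes, the phenotype is a made-up binary trait, and plain `prcomp` stands in for whatever PCA method turns out to be appropriate. It is not EIGENSTRAT itself, and all variable names are hypothetical.

```r
# Illustrative only: simulated data standing in for the real genotypes.
set.seed(1)
n_ind <- 300    # individuals
n_snp <- 2000   # small stand-in for the 177,000 SNPs

# Simulated genotypes coded as 0/1/2 minor-allele counts (individuals x SNPs).
maf  <- runif(n_snp, 0.05, 0.5)
geno <- sapply(maf, function(p) rbinom(n_ind, 2, p))

# Drop monomorphic SNPs; they have zero variance and cannot be scaled.
geno <- geno[, apply(geno, 2, var) > 0, drop = FALSE]

# Ordinary PCA on centered and scaled genotypes (assumes no missing values).
pca <- prcomp(geno, center = TRUE, scale. = TRUE)

# Keep the top 10 PCs; the columns are already named PC1 ... PC10.
pcs <- pca$x[, 1:10]

# Made-up binary phenotype, just so the model below can be fitted.
pheno <- rbinom(n_ind, 1, 0.5)

# Single-SNP logistic regression with the 10 PCs as covariates.
dat <- data.frame(pheno = pheno, snp = geno[, 1], pcs)
fit <- glm(pheno ~ ., data = dat, family = binomial)

# Effect estimate, standard error, z value and p-value for the SNP term.
summary(fit)$coefficients["snp", ]
```

In a real analysis the last step would be repeated for every SNP, and I am not sure this naive approach scales well to the full data set, which is part of why I am asking about dedicated tools.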
I know about EIGENSTRAT, which is implemented as part of the EIGENSOFT software. Has anyone used it before? I downloaded the software last night. Is there a tutorial on how to load/import the 177,000 SNPs into the software and run the PCA? Do the SNPs need to be in a special format? Once I have the PCA output, is it possible to import it into R? Do I need to write a wrapper to call the software? Any tips or information would be helpful. Thanks.
Good suggestion, thanks. I'll take a look at the manual and get back to you if I get stuck.