I have seen papers that include 1, 2, 3, 4, 5, 8, or 20 principal components (PCs) as covariates in GWA studies. I have probably seen others that used a different number without my noticing at the time.
Some papers appear to choose the number based on the appearance of scree plots, but more commonly no explanation is given for why a particular number was included.
Certain papers out of the Broad Institute include as many as 20 PCs and provide no rationale for doing so.
My questions are:
1. What are the various rationales for including a certain number of PCs in a GWAS?
2. Is the inclusion of more PCs regarded as conservative or anti-conservative, and why?
3. Are there good papers that explain this in a rigorous fashion in the context of genetic studies, controlling for ethnicity etc.?