Good morning,
I have a quick question. I need to perform a GWAS, but while analyzing my data I found that my individuals fall into 3 populations, with a lot of mestizos (admixed individuals) among them.
When I continue with the analysis of a quantitative trait, I get a lot of false positives. How should I deal with mestizos in a GWAS? The admixture is a confounding factor, right?
Thanks.
I don't know how to create the covariate files...
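The covariate file in PLINK has the same layout as the phenotype file: a whitespace-delimited text file whose first two columns are the family ID (FID) and the individual ID (IID), followed by one column per covariate. For a basic example with the covariates "AGE" and "ANCESTRY", the family ID FAM001 and the individual IDs 1 and 2, the file has the header row "FID IID AGE ANCESTRY" followed by one row per individual.
If you save this as "file.txt" and add the covariates in PLINK using
--covar file.txt
it will use all of them. To restrict the covariates to, for example, AGE, add
--covar-name AGE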
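In case it helps, here is a minimal Python sketch of how such a covariate file could be generated. The function names (`format_covar_table`, `plink_covar_args`) and the AGE/ANCESTRY values are made up for illustration, and ANCESTRY is assumed to be coded as a number. Only the `--covar` and `--covar-name` flags come from the answer above.

```python
def format_covar_table(covariate_names, rows):
    """Return the text of a PLINK-style covariate file.

    rows is an iterable of (fid, iid, values), where values holds one
    entry per covariate name, in the same order.
    """
    lines = [" ".join(["FID", "IID", *covariate_names])]
    for fid, iid, values in rows:
        if len(values) != len(covariate_names):
            raise ValueError(f"{fid} {iid}: expected {len(covariate_names)} values")
        lines.append(" ".join([str(fid), str(iid), *(str(v) for v in values)]))
    return "\n".join(lines) + "\n"


def plink_covar_args(covar_path, covar_names=None):
    """Build the covariate-related PLINK arguments as a list of strings."""
    args = ["--covar", str(covar_path)]
    if covar_names:
        # Several names can be given as a comma-separated list.
        args += ["--covar-name", ",".join(covar_names)]
    return args


# Illustrative values only: two individuals from family FAM001.
example_rows = [
    ("FAM001", 1, [34, 1]),
    ("FAM001", 2, [41, 2]),
]
covar_text = format_covar_table(["AGE", "ANCESTRY"], example_rows)
print(covar_text, end="")
print(" ".join(plink_covar_args("file.txt", ["AGE"])))
```

Writing `covar_text` to "file.txt" (for instance with `pathlib.Path("file.txt").write_text(covar_text)`) gives a file that can be passed to `--covar`.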
Are you sure it is a good idea to use the populations as covariates? Keep in mind that I have a lot of mestizos, and each of them has a different ancestral composition (for example, a proportion of 0.2 from population 1, 0.5 from population 2 and 0.3 from population 3). Their genotypes may differ a lot even when their proportions for each population are similar. So I think the only way to solve this is to exclude the mestizos.
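For what it's worth, one possible alternative to a single population label is to use the per-individual ancestry proportions themselves as quantitative covariates. The sketch below is only an illustration of that idea, not a recommendation from this thread: the function name `ancestry_covariates`, the column names ANC1/ANC2 and the second individual's proportions are all hypothetical. It drops the last proportion because the proportions sum to 1, so keeping all of them would be redundant with the intercept of the regression.

```python
def ancestry_covariates(proportions, tol=1e-6):
    """Return all but the last ancestry proportion of one individual.

    The proportions are expected to sum to 1, so the last one is fully
    determined by the others and is left out.
    """
    if abs(sum(proportions) - 1.0) > tol:
        raise ValueError("ancestry proportions should sum to 1")
    return list(proportions[:-1])


# Hypothetical individuals keyed by (family ID, individual ID).
individuals = {
    ("FAM001", "1"): (0.2, 0.5, 0.3),
    ("FAM001", "2"): (0.9, 0.1, 0.0),
}

lines = ["FID IID ANC1 ANC2"]
for (fid, iid), props in individuals.items():
    values = [f"{x:.3f}" for x in ancestry_covariates(props)]
    lines.append(" ".join([fid, iid, *values]))
covar_text = "\n".join(lines) + "\n"
print(covar_text, end="")
```

This does not address the concern that two individuals with similar proportions can still have quite different genotypes. It only shows how continuous ancestry could be encoded in a covariate file.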
If you think that you have subpopulations within your mestizos, try using EIGENSTRAT instead. It adjusts the genotypes (and phenotypes) themselves for population stratification along the top principal components, so you don't have to use the covariates.
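To give a rough feel for the principal-component idea behind EIGENSTRAT, here is a small pure-Python sketch that finds the leading axis of genetic variation among individuals. It is a toy illustration under simplifying assumptions (no missing genotypes, a tiny made-up genotype matrix, a single component found by power iteration) and is not the EIGENSOFT implementation. The function name `top_ancestry_axis` is hypothetical.

```python
import math
import random


def top_ancestry_axis(genotypes, iterations=200, seed=0):
    """Return the leading principal component, one value per individual.

    genotypes is a list of per-individual lists of allele counts (0, 1 or 2).
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Standardize each SNP: subtract its mean, divide by sqrt(p * (1 - p)).
    columns = []
    for j in range(m):
        col = [genotypes[i][j] for i in range(n)]
        mean = sum(col) / n
        p = mean / 2.0
        if p <= 0.0 or p >= 1.0:
            continue  # a monomorphic SNP carries no ancestry information
        scale = math.sqrt(p * (1.0 - p))
        columns.append([(g - mean) / scale for g in col])
    if not columns:
        raise ValueError("no polymorphic SNPs to work with")

    # Individual-by-individual covariance matrix of standardized genotypes.
    cov = [
        [sum(c[a] * c[b] for c in columns) / len(columns) for b in range(n)]
        for a in range(n)
    ]

    # Power iteration: repeatedly multiply by cov and renormalize.
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Made-up data: individuals 0-1 and 2-3 come from two different groups.
toy_genotypes = [
    [2, 2, 0, 0],
    [2, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 1],
]
pc1 = top_ancestry_axis(toy_genotypes)
print([round(x, 3) for x in pc1])
```

In this toy data the two groups end up with opposite signs on the first component, and admixed individuals would fall somewhere in between. The overall sign of the component is arbitrary.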