How to input covariates in GEMMA?
6.7 years ago
maya123z ▴ 110

I'm new to GWAS and I've been trying to perform my analysis based on what's described in this paper, since the nature of my data is similar to theirs. So far I have cleaned my genotype data and then used GCTA to derive the top five principal components. Now I'm trying to use GEMMA to fit a linear mixed model, with the five principal components included as covariates.

The covariate file is where I'm stuck. The GEMMA manual provides an example on page 14 for five individuals with three covariates. It looks like this:

1  1  -1.5
1  2  0.3
1  2  0.6
1  1  -0.8
1  1  2.0

However, I'm confused as to what the numbers in this example actually mean and how I can derive them. The manual says that the first column of 1s indicates that the intercept should be included, but what do the other two columns mean? The output from GCTA gave me the top five principal components as an "eigenvector" file and an "eigenvalue" file. Which of these would I use to generate the covariate file for GEMMA, and how would I go about doing this?

Edit: I noticed in the manual that you can include eigenvalue/eigenvector files instead of a relatedness matrix. Is this what they mean by including the top PCs as covariates?

gwas gemma gcta pca • 5.3k views
6.7 years ago
maya123z ▴ 110

I ended up contacting the GEMMA email list directly, so I figured I'd answer my own question in case anyone else runs into this problem down the road. The answer is to start from the eigenvector file that GCTA outputs: first remove columns 1-2 (which contain the family and individual IDs), then add a new first column containing only 1s. This makes the file compatible with GEMMA. Then save it as a .txt file and pass it as your covariates file using the -c option. Hope this is helpful to others!
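For anyone who would rather script the conversion than do it by hand, here is a minimal Python sketch of the steps above. It assumes the GCTA eigenvector file is whitespace-separated with no header line, laid out as family ID, individual ID, then one column per principal component. The function names and the tab-separated output are my own choices for illustration, not anything prescribed by GEMMA or GCTA.

```python
def eigenvec_to_gemma_covariates(eigenvec_lines):
    """Convert GCTA eigenvector rows (FID IID PC1 PC2 ...) to covariate rows.

    Assumes no header line and whitespace-separated fields.
    """
    rows = []
    for line in eigenvec_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        pcs = fields[2:]  # drop the two ID columns
        rows.append("\t".join(["1"] + pcs))  # leading 1 is the intercept column
    return rows


def convert_file(eigenvec_path, covariate_path):
    """Read an eigenvector file and write the covariate file (paths are up to you)."""
    with open(eigenvec_path) as src:
        rows = eigenvec_to_gemma_covariates(src)
    with open(covariate_path, "w") as dst:
        dst.write("\n".join(rows) + "\n")
```

Calling `convert_file` with your own eigenvector path and an output path of your choosing would produce the file to pass to GEMMA with -c. The row order is left untouched, so it stays whatever order GCTA wrote.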

In that case, how do the samples in your eigenvector file get matched to the samples in the downstream GEMMA analysis? In other words, once the ID columns are removed, how does GEMMA know which row belongs to which sample, for example when lining the covariates up with the sample-wise relatedness? I ask partly because I don't know much about how GEMMA is implemented internally. Thanks!

My understanding is that the covariate file must be in the same order as your phenotype file. In other words, the first row of the eigenvector file corresponds to the first individual in the phenotype file, and so on.
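If you are not sure the two files are in the same order, one option is to rebuild the covariate rows by sample ID before dropping the ID columns. The sketch below is only an illustration under some assumptions: that the sample order is defined by a PLINK .fam file whose first two columns are family ID and individual ID, and that the GCTA eigenvector file has no header and starts with the same two ID columns. The function name is hypothetical.

```python
def order_covariates_like_fam(fam_lines, eigenvec_lines):
    """Return covariate rows (intercept + PCs) in the sample order of the .fam file.

    Assumes both inputs start with family ID and individual ID columns.
    """
    # Index the principal components by (family ID, individual ID).
    pcs_by_id = {}
    for line in eigenvec_lines:
        fields = line.split()
        if fields:
            pcs_by_id[(fields[0], fields[1])] = fields[2:]

    rows = []
    for line in fam_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        key = (fields[0], fields[1])
        if key not in pcs_by_id:
            # Fail loudly rather than silently misaligning samples.
            raise KeyError(f"no principal components for sample {key}")
        rows.append("\t".join(["1"] + pcs_by_id[key]))
    return rows
```

Raising an error on a missing sample is a deliberate choice here: a covariate file with a skipped row would shift every later sample by one, which is hard to spot afterwards.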
