Question

How to input covariates in GEMMA?

7.1 years ago

maya123z ▴ 110

I'm new to GWAS and I've been trying to perform my analysis based on what's described in this paper, since the nature of my data is similar to theirs. So far I have cleaned my genotype data and then used GCTA to derive the top five principal components. Now I'm trying to use GEMMA to fit a linear mixed model, with the five principal components included as covariates.

The covariate file is where I'm stuck. The GEMMA manual provides an example on page 14 for five individuals with three covariates. It looks like this:

However, I'm confused as to what the numbers in this example actually mean and how I can derive them. The manual says that the first column of 1's indicates that the intercept should be included, but what do the other two columns mean? The output from GCTA gave me the top five principal components as an "eigenvector" file and an "eigenvalue" file. Which of these would I use to generate the covariate file for GEMMA, and how would I go about doing this?

Edit: I noticed in the manual that you can supply eigenvalue/eigenvector files instead of a relatedness matrix. Is this what they mean by including the top PCs as covariates?

gwas gemma gcta pca • 5.8k views

score 5 · Accepted Answer · 2018-03-22

7.1 years ago

maya123z ▴ 110

I ended up contacting the GEMMA email list directly, so I figured I'd answer my own question in case anyone else runs into this problem down the road. Starting from the eigenvector file that GCTA outputs, first remove columns 1-2 (the family/individual IDs), then add a new column 1 containing only 1's, which is the intercept. This makes the file compatible with GEMMA. Save it as a .txt file and pass it as your covariates file using the -c option. Hope this is helpful to others!
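For anyone who wants to script this, here is a minimal Python sketch of the conversion described above. It assumes the GCTA eigenvector file is whitespace-delimited with no header, two ID columns first and the principal components after them. The function name and the example rows are made up for illustration and are not real GCTA output.

```python
def eigenvec_to_covariates(lines):
    """Drop the two ID columns and prepend an intercept column of 1s.

    Assumes each line is: family ID, individual ID, PC1, PC2, ...
    """
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        pcs = fields[2:]  # everything after the two ID columns
        rows.append(" ".join(["1"] + pcs))
    return rows


# Made-up example rows, not real GCTA output.
example = [
    "FAM1 IND1 0.0123 -0.0456",
    "FAM2 IND2 -0.0789 0.0101",
]
covariate_rows = eigenvec_to_covariates(example)
print("\n".join(covariate_rows))
```

In practice you would read the lines from your eigenvector file, write the returned rows to a .txt file, and give that file to GEMMA with -c. The row order is left unchanged, so it stays whatever order GCTA wrote.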

If so, I was wondering how the sample IDs in your eigenvector file get matched up in the downstream GEMMA analysis. In other words, once the ID columns are removed, how does GEMMA know which row belongs to which sample? I ask partly because I don't have much insight into how GEMMA is implemented internally. Thanks!

ADD REPLY • link 7.1 years ago by tanklovemermaid • 0

My understanding is that the covariate file must be in the same order as your phenotype file. In other words, the first row of the eigenvector file corresponds to the first individual in the phenotype file, and so on.
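If you are not sure the two files are already in the same order, one way to be safe is to reorder the eigenvector rows by sample ID before dropping the ID columns. The sketch below is only an illustration under a few assumptions: the phenotype sample order is available as a list of (family ID, individual ID) pairs, for example taken from the first two columns of a PLINK .fam file, and the eigenvector file has the two ID columns first. The function name and example values are hypothetical.

```python
def order_covariates(sample_order, eigenvec_lines):
    """Build covariate rows in the same order as the phenotype samples.

    sample_order: list of (family ID, individual ID) tuples in phenotype order.
    eigenvec_lines: lines of the form "FID IID PC1 PC2 ...".
    """
    by_id = {}
    for line in eigenvec_lines:
        fields = line.split()
        if fields:
            by_id[(fields[0], fields[1])] = fields[2:]

    rows = []
    for key in sample_order:
        if key not in by_id:
            # Fail loudly rather than silently misaligning rows.
            raise KeyError(f"no eigenvector row for sample {key}")
        rows.append(" ".join(["1"] + by_id[key]))
    return rows


# Made-up example: eigenvector rows are in a different order than the phenotypes.
phenotype_order = [("FAM2", "IND2"), ("FAM1", "IND1")]
eigenvec = ["FAM1 IND1 0.01 0.02", "FAM2 IND2 0.03 0.04"]
ordered_rows = order_covariates(phenotype_order, eigenvec)
print("\n".join(ordered_rows))
```

Raising an error on a missing sample is a deliberate choice here: a covariate file with fewer rows than phenotypes, or rows in the wrong order, would otherwise pair covariates with the wrong individuals.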

ADD REPLY • link 7.1 years ago by maya123z ▴ 110