I have a question about PLINK's linear association test: how do you obtain the intercept, or b0 value?
I have a quantitative trait and a number of covariates, and I ran a GWAS on them using the following PLINK command:
plink --bfile mydata --linear --pheno pheno.txt --covar covar.txt --pheno-name height --covar-name sex,age,race
The results in 'plink.assoc.linear' look like this for an example SNP:
CHR SNP BP A1 TEST NMISS BETA STAT P
0 kgp22785392 0 A ADD 1726 -0.4155 -1.537 0.1244
0 kgp22785392 0 A sex 1726 5.59 0.7218 0.4705
0 kgp22785392 0 A age 1726 -21.09 -2.771 0.005643
0 kgp22785392 0 A race 1726 11.21 1.479 0.1393
PLINK states that the regression equation would be something like this:
height = b0 -0.4155*ADD + 5.59*sex -21.09*age +11.21*race + e
As a sanity check, I just wanted to make sure I am understanding this equation properly.
My questions are:
1) How do I get b0? Is it just the mean height?
2) If b0 is the mean height, I assume you need to take the mean only over the values that have been included in the analysis. That is, you should only average over the same NMISS individuals that PLINK used. If this is the case, how do you get the exact set of NMISS individuals?
3) If I plug in the values of ADD, sex, age, and race for one individual along with b0, I should be able to get the value for e. How do I check that the value I'm getting is correct?
EDIT:
I just ran a multiple regression in R and see that the values of the coefficients are very similar to the BETA values from PLINK, as expected.
R also reports the b0 value. Why doesn't PLINK report it?
I just added an option to output intercepts; using one of the builds posted at https://www.cog-genomics.org/plink2/ , replace "--linear" with "--linear intercept" and you'll get an additional "INTERCEPT" row for each variant. Let me know if you have any problems.
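On the three numbered questions, here is a rough sketch in Python with NumPy. It runs on made-up data, not the poster's files, and it is ordinary least squares rather than PLINK's own code. All variable names and simulated values are assumptions for illustration. It shows that b0 is generally not the mean of the phenotype: for an OLS fit with an intercept, b0 equals the phenotype mean minus the sum of each coefficient times the mean of its predictor, all taken over the individuals with no missing phenotype, genotype, or covariate value (the NMISS individuals). The residual e for one individual is then the observed value minus the fitted value.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-ins for the real data (every value here is invented).
add = rng.integers(0, 3, size=n).astype(float)   # allele dosage: 0, 1 or 2
sex = rng.integers(1, 3, size=n).astype(float)   # coded 1 or 2
age = rng.uniform(20, 70, size=n)
height = 150.0 - 0.4 * add + 5.0 * sex + 0.3 * age + rng.normal(0, 5, size=n)

# Mimic missing data with NaN (in PLINK input files this is usually -9).
height[:5] = np.nan
add[5:8] = np.nan

# Keep only complete cases: the individuals a regression can actually use.
X = np.column_stack([add, sex, age])
keep = ~np.isnan(height) & ~np.isnan(X).any(axis=1)
nmiss = int(keep.sum())

# Fit height = b0 + b1*ADD + b2*sex + b3*age + e by least squares.
Xk, yk = X[keep], height[keep]
design = np.column_stack([np.ones(nmiss), Xk])
coef, *_ = np.linalg.lstsq(design, yk, rcond=None)
b0, betas = coef[0], coef[1:]

# b0 recovered from the slopes and the complete-case means.
b0_from_means = yk.mean() - Xk.mean(axis=0) @ betas

# Residual e for every included individual: observed minus fitted.
residuals = yk - design @ coef

print("individuals used:", nmiss)
print("b0 from the fit:", b0)
print("b0 from means and slopes:", b0_from_means)
print("mean height (not b0):", yk.mean())
print("residual of first included individual:", residuals[0])
```

With real data, the same check should apply to the BETA values PLINK reports for one SNP: the mean-based b0 computed over the same NMISS individuals should match the INTERCEPT row, and the residuals over those individuals should sum to roughly zero.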