Hi, what is the mathematical formula behind the --linear association test used by PLINK?
The most basic association test is just a Chi-squared test on a 2 x 2 contingency table of the minor allele tallies, which I elaborate on here: A: SNP dataset and Z Score
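To make the 2 x 2 case concrete, here is a minimal sketch in plain Python. The function name `allelic_chi2` and the allele counts are made up for illustration, and this is the textbook Pearson Chi-squared test, not code taken from PLINK.

```python
import math

def allelic_chi2(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-squared test (1 df) on a 2 x 2 table of allele counts."""
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: 100 cases and 100 controls, i.e. 200 alleles per group.
chi2, p_value = allelic_chi2(60, 140, 40, 160)
print(f"chi2 = {chi2:.4f}, p = {p_value:.4f}")
```

Note that each individual contributes two alleles, so the table total is twice the sample size.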
Any other test, such as linear / logistic regression, family-based tests, etc., again uses either minor allele tallies or genotypes encoded categorically (REF, HET, HOM), with different assumptions about inheritance patterns.
Perhaps focus on the mathematics of these specific tests outside of PLINK as opposed to finding the exact formulae within the PLINK documentation itself. PLINK just re-uses already-published statistical tests.
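As an illustration of the regression case, here is a rough sketch of a simple linear regression of a quantitative phenotype on genotype, y = b0 + b1 * g + error. It assumes an additive coding (0, 1 or 2 copies of the minor allele) and no covariates; whether this matches exactly what --linear does should be checked against the PLINK documentation. The function name `additive_linear_test` and the data are hypothetical.

```python
import math

def additive_linear_test(genotypes, phenotypes):
    """Ordinary least squares for y = b0 + b1 * g, with a t statistic for b1.

    genotypes: minor allele counts per individual (0, 1 or 2), an assumed coding.
    phenotypes: quantitative trait values, in the same order.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n

    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))

    beta = sxy / sxx                    # slope: effect per copy of the minor allele
    intercept = mean_y - beta * mean_g

    # Residual sum of squares and the standard error of the slope.
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(genotypes, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)

    t_stat = beta / se                  # compared against a t distribution with n - 2 df
    return beta, se, t_stat

# Hypothetical toy data: six individuals.
genotypes = [0, 0, 1, 1, 2, 2]
phenotypes = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
beta, se, t_stat = additive_linear_test(genotypes, phenotypes)
print(f"beta = {beta:.3f}, se = {se:.3f}, t = {t_stat:.2f}")
```

The p-value comes from comparing the t statistic with a t distribution with n - 2 degrees of freedom, which needs a statistics library and is left out here.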
Hi Kevin. I think we often see three genotypes (AA, Aa, aa), in which case we should have a 2 x 3 contingency table for a case-control study. Could you please explain how to do a Chi-squared test for this? My understanding is that for a 2 x 3 contingency table, the degrees of freedom should be 2. However, PLINK still uses 1 df, which confuses me.
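One possible reading of the 1 df, going by the answer above, is that the basic test tallies alleles rather than genotypes: collapsing the 2 x 3 genotype table into allele counts gives a 2 x 2 table with 1 df, whereas testing the genotype table directly gives 2 df. The sketch below shows both on the same made-up counts; the helper names are hypothetical and this is generic Pearson Chi-squared arithmetic, not PLINK's code.

```python
import math

def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def genotypes_to_alleles(genotype_table):
    """Collapse rows of [AA, Aa, aa] counts into rows of [A, a] allele counts."""
    return [[2 * n_AA + n_Aa, n_Aa + 2 * n_aa] for n_AA, n_Aa, n_aa in genotype_table]

def chi2_p_value(chi2, df):
    """Upper tail probability, using the closed forms available for 1 and 2 df."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2.0))
    if df == 2:
        return math.exp(-chi2 / 2.0)
    raise ValueError("only 1 or 2 df are handled in this sketch")

# Hypothetical counts: rows are cases and controls, columns are AA, Aa, aa.
genotype_table = [[10, 40, 50], [20, 50, 30]]
allele_table = genotypes_to_alleles(genotype_table)

geno_chi2, geno_df = pearson_chi2(genotype_table)    # genotypic test, 2 df
allele_chi2, allele_df = pearson_chi2(allele_table)  # allelic test, 1 df

print(f"genotypic: chi2 = {geno_chi2:.3f}, df = {geno_df}, p = {chi2_p_value(geno_chi2, geno_df):.4f}")
print(f"allelic:   chi2 = {allele_chi2:.3f}, df = {allele_df}, p = {chi2_p_value(allele_chi2, allele_df):.4f}")
```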
Please add some detail on what you've tried on your own to understand this. Have you read the PLINK paper(s)? Do you have a specific question? Did the papers or the manual mention anything about the --linear operation?
Hi, yes, I read the PLINK paper (link), but in the "association" section I couldn't tell where exactly it discusses the --linear option, because it covers tests in general with a wide variety of formulae. I am not able to link the --linear option to its respective formula.