Question

Set-based test - PLINK


9.8 years ago

aleix.arnau1990 ▴ 10

Hi,

The PLINK documentation describes two options for running a set-based test analysis.

HINT Two extremes are to perform a test based on a) the best single SNP result per set:

--set-max 1
--set-p 1

or to use all SNPs in a set:

--set-max 99999
--set-p 1
--set-r2 1

Could someone tell me which hypothesis is being tested in each case?

I have several sets of SNPs (each set is a gene of interest containing many SNPs), but I don't know which hypothesis is being tested when applying --set-max 1 versus --set-max 99999.

I imagine that if I want to know whether a gene is significantly associated with my disease, I should use all SNPs. In that case PLINK gives me the average STAT across all of them and calculates EMP1. But maybe in that case I don't get any significant result, when I could have gotten one by testing just the best SNP for each gene. So what is the difference? Which hypothesis is being tested when using all SNPs versus just the best one in each gene?

Thanks a lot!


Updated 2.6 years ago by Ram • written 9.8 years ago by aleix.arnau1990

Ram · Answer 1 · 2016-02-01

But maybe in that case I don't get any significant result, when I could have gotten one by testing just the best SNP for each gene.

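I think it all depends on the genetic architecture of your phenotype. If you have multiple variants within a gene or pathway that individually lack statistical power (because of low frequency or a modest effect on your phenotype), testing the whole set may be preferable. Conversely, if a single variant carries most of the signal, averaging it with many unassociated SNPs dilutes the set statistic, and the best-SNP test (--set-max 1) is likely to do better.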
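To make the difference concrete, here is a minimal Python sketch of how the set statistic changes between the two extremes. It assumes the procedure described in the PLINK documentation: keep SNPs with p-value at or below --set-p, take up to --set-max of the best ones, average their single-SNP statistics, and compare that average against permuted data to get EMP1. The function names and the chi-square and p-values are made up for illustration, and LD pruning (--set-r2) is left out entirely. In both cases the null hypothesis is the same (no SNP in the set is associated with the phenotype); what differs is the kind of alternative the statistic is sensitive to.

```python
def set_statistic(chisq, pvals, set_max, set_p):
    """Mean of the top `set_max` single-SNP statistics with p <= set_p.

    Hypothetical helper; LD pruning (--set-r2) is deliberately ignored.
    """
    kept = [c for c, p in zip(chisq, pvals) if p <= set_p]
    kept.sort(reverse=True)          # most significant SNP first
    top = kept[:set_max]
    return sum(top) / len(top) if top else 0.0


def empirical_p(observed, permuted):
    """Share of permuted set statistics at least as large as the observed one.

    Uses the (hits + 1) / (n + 1) convention, assumed here for illustration.
    """
    hits = sum(1 for s in permuted if s >= observed)
    return (hits + 1) / (len(permuted) + 1)


# Made-up gene with five SNPs: several modest signals, no single strong one.
chisq = [3.0, 2.5, 2.8, 3.2, 2.9]
pvals = [0.083, 0.114, 0.094, 0.074, 0.089]

# --set-max 1 --set-p 1: the set statistic is just the best SNP's statistic.
best_snp = set_statistic(chisq, pvals, set_max=1, set_p=1.0)

# --set-max 99999 --set-p 1: the set statistic is the mean over all SNPs.
all_snps = set_statistic(chisq, pvals, set_max=99999, set_p=1.0)

print(best_snp, all_snps)
```

In a real run each of these statistics would be passed to something like `empirical_p` together with the same statistic recomputed on phenotype-permuted data. The best-SNP version then behaves like a gene-wide corrected single-SNP test, while the all-SNP version rewards genes where many SNPs are shifted a little in the same direction.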