Hi,
I have recently been working on getting gene-based p-values from SNPs.
At first I performed an Armitage test on each SNP separately for association with a certain disease. After that I applied the Simes method to the SNP p-values to aggregate them into one gene-based p-value.
After that I used the SKAT method to calculate gene-level p-values.
I came across some results that I would be happy to get your help with.
For example, for the DOC2A gene, I checked in LDlink and found that the SNPs in this gene are in linkage equilibrium. I was wondering which method I should choose to determine the p-value of the gene, since the Simes method returns a significant p-value and the SKAT method returns a non-significant p-value (the p-values of each SNP in the gene are significant). Does the SKAT method take linkage equilibrium into account?
The Simes meta-analysis procedure should only be applied to statistically independent hypothesis tests; this is certainly not the case in the presence of LD.
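For reference, the Simes combination itself is only a few lines. This is a minimal sketch, assuming the per-SNP p-values for one gene are already in a plain list; the function name `simes_pvalue` is made up for illustration and is not from any package. It sorts the m p-values and takes the minimum of m * p(i) / i over the ranks i.

```python
def simes_pvalue(pvalues):
    """Combine per-SNP p-values into one gene-level p-value with the Simes rule."""
    m = len(pvalues)
    if m == 0:
        raise ValueError("need at least one p-value")
    ordered = sorted(pvalues)
    # Simes: min over ranks i of m * p_(i) / i, capped at 1.
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    return min(1.0, combined)


# Hypothetical p-values for three SNPs in one gene.
gene_p = simes_pvalue([0.01, 0.04, 0.03])
```

Note that nothing in this calculation looks at the correlation between SNPs, which is why the dependence structure matters when interpreting the result.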
For combining variants across genes, GCTA or LDAC can be used at the gene level, and the older Z-score based method, MAGMA, is still valid. The quickest-to-implement and dirtiest method -- selecting the minimum p-value and repeating on permuted phenotypes for a permutation test -- is also valid, so long as you are careful about any phenotype/covariate correlations or block designs.
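The minimum-p permutation idea can be sketched roughly as below. Several things here are assumptions for illustration only: the function names are hypothetical, genotypes are given as one list of allele counts per SNP, and the per-SNP statistic is the absolute genotype-phenotype correlation rather than an Armitage test (for a fixed sample size, the largest absolute correlation stands in for the smallest p-value). The plain shuffle treats all individuals as exchangeable, so it does not handle covariates or block designs.

```python
import random
from statistics import mean


def abs_correlation(x, y):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def gene_statistic(genotypes, phenotype):
    """Strongest single-SNP signal in the gene (stand-in for the minimum p-value)."""
    return max(abs_correlation(snp, phenotype) for snp in genotypes)


def minp_permutation_pvalue(genotypes, phenotype, n_perm=1000, seed=0):
    """Gene-level p-value from shuffling phenotypes and recomputing the best SNP."""
    rng = random.Random(seed)
    observed = gene_statistic(genotypes, phenotype)
    shuffled = list(phenotype)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling phenotypes keeps the LD between SNPs intact.
        rng.shuffle(shuffled)
        if gene_statistic(genotypes, shuffled) >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


# Toy data: 20 individuals, 2 SNPs, binary phenotype (all values hypothetical).
phenotype = [0] * 10 + [1] * 10
genotypes = [
    [0] * 10 + [2] * 10,        # SNP strongly tied to the phenotype
    [0, 1] * 10,                # SNP unrelated to the phenotype
]
p_gene = minp_permutation_pvalue(genotypes, phenotype, n_perm=200)
```

Because the genotype matrix is never shuffled, the correlation between SNPs is preserved in every permutation, which is what makes this approach tolerant of LD.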
SKAT is specifically a rare-variant association test, and depending on your settings when you ran it, it may have excluded large classes of variants such as those with frequency > 1%, non-coding, and/or synonymous-annotated mutations.
LChart, thank you. So just to clarify, SKAT doesn't use LD when calculating the p-values?
Not directly. However, SKAT, like all burden tests, operates by "collapsing" genic variants into a single score; this means that strong LD inflates scores, as individuals with one of the variants are highly likely to have multiple others.
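A toy illustration of the collapsing point, assuming a simple unweighted sum of minor-allele counts per individual. This is a generic burden-style score, not SKAT's actual statistic, and the function name and genotype values are made up.

```python
def burden_scores(genotypes):
    """Collapse each individual's minor-allele counts across variants into one score."""
    return [sum(row) for row in genotypes]


# Rows are individuals, columns are three variants in one gene.
# No LD: each variant is carried by a different individual.
unlinked = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
# Strong LD: the same individual carries all three variants together.
linked = [[1, 1, 1], [0, 0, 0], [0, 0, 0], [0, 0, 0]]

unlinked_scores = burden_scores(unlinked)
linked_scores = burden_scores(linked)
```

Both panels have the same total number of minor alleles, but under strong LD they pile up in one carrier, so that individual's collapsed score is three times larger.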