Hello, I followed the procedure below for a polygenic risk score (PRS) analysis of whole-genome sequencing data, and I want to make sure there are no problems with it.
To find significant variants that might have a functional impact, I used only functional variants: exonic variants that alter the protein sequence, and intronic variants known to reduce or increase gene expression.
I tried p-value thresholding to refine the PRS candidates, and a p-value cutoff of 5e-3 gave the best PRS performance.
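For reference, a minimal sketch of what p-value thresholding means here: keep only the variants whose association p-value is at or below the cutoff, then sum effect size times allele dosage per sample. The function name `prs_at_threshold` and the toy numbers are hypothetical, not taken from the actual analysis, and LD clumping is assumed to have been done beforehand.

```python
import numpy as np

def prs_at_threshold(dosages, betas, pvalues, p_cutoff):
    """Return (per-sample PRS, number of variants used) for one p-value cutoff.

    dosages: (n_samples, n_variants) effect-allele counts (0, 1 or 2)
    betas:   per-variant effect sizes from the association test
    pvalues: per-variant association p-values
    """
    dosages = np.asarray(dosages, dtype=float)
    betas = np.asarray(betas, dtype=float)
    pvalues = np.asarray(pvalues, dtype=float)
    keep = pvalues <= p_cutoff              # variants passing the threshold
    scores = dosages[:, keep] @ betas[keep]  # weighted sum over kept variants
    return scores, int(keep.sum())

# Toy data (made up): 2 samples, 3 variants
dosages = [[0, 1, 2],
           [2, 0, 1]]
betas = [0.5, -0.2, 0.1]
pvalues = [1e-4, 4e-3, 0.2]

scores, n_used = prs_at_threshold(dosages, betas, pvalues, p_cutoff=5e-3)
print(n_used, scores)
```

If the cutoff is picked because it performs best, that performance is only trustworthy when it is measured on samples that were not used to pick the cutoff.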
The variants used to calculate the PRS seem to be highly enriched in certain pathways, such as hypoxia, so I want to emphasize that the 'significant variants used in the PRS calculation' are related to hypoxia.
My concerns are below.
I know there are many methods for emphasizing functional variants, such as prioritization, but I simply chose to exclude non-functional variants. Does that seem acceptable?
Can I call the variants used for the PRS, which satisfy the 5e-3 p-value cutoff, 'significant variants' without any p-value correction? Or does it make sense to say that 'hypoxia is related to the case phenotype' based on the enrichment test result for the PRS candidates?
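To make the correction question concrete, here is a minimal sketch comparing a raw cutoff with two standard multiple-testing procedures, Bonferroni and Benjamini-Hochberg. The helper names and the five p-values are hypothetical; in a real analysis the number of tests would be the number of variants tested, which is why the conventional genome-wide significance level (5e-8) is far stricter than 5e-3.

```python
import numpy as np

def bonferroni_cutoff(alpha, n_tests):
    """Per-test p-value cutoff that controls the family-wise error rate at alpha."""
    return alpha / n_tests

def benjamini_hochberg(pvalues, alpha=0.05):
    """Boolean mask of tests rejected at false discovery rate alpha (BH step-up)."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)                         # indices of p-values, ascending
    thresholds = alpha * np.arange(1, m + 1) / m  # BH critical value for each rank
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])          # largest rank meeting its threshold
        reject[order[:k + 1]] = True              # reject everything up to that rank
    return reject

# Toy p-values (made up), deliberately unsorted
pvalues = [0.2, 0.001, 0.5, 0.025, 0.004]

bonf = np.asarray(pvalues) <= bonferroni_cutoff(0.05, len(pvalues))
bh = benjamini_hochberg(pvalues, alpha=0.05)
print("Bonferroni:", bonf.tolist())
print("BH:        ", bh.tolist())
```

A variant that passes a PRS inclusion threshold but not a corrected threshold is better described as "included in the PRS" than as "significant".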
I would appreciate any comments or advice. Thank you in advance.
Thank you for your helpful comment. I actually calculated the enrichment score using the genes containing those functional variants (the ones remaining after LD clumping).
Really, that's not detailed enough. You need to specify whether you ran GSEA or GSORA; for GSEA you need to specify the gene universe and the score, and for GSORA you need to specify what the background gene set is.
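To illustrate why the background matters for an over-representation test, here is a minimal sketch of a one-sided hypergeometric test with an explicit background gene set. The function name `ora_pvalue` and the gene IDs are hypothetical. The sketch assumes the appropriate background is the set of genes that could have ended up in the candidate list at all (for example, genes carrying at least one tested functional variant) rather than every gene in the genome.

```python
from math import comb

def ora_pvalue(hits, pathway_genes, background):
    """One-sided hypergeometric p-value for over-representation.

    Returns (overlap, p) where p = P(X >= overlap) when len(hits) genes are
    drawn without replacement from the background.
    """
    background = set(background)
    hits = set(hits) & background            # ignore genes outside the background
    pathway = set(pathway_genes) & background
    N, K, n = len(background), len(pathway), len(hits)
    k = len(hits & pathway)
    # Sum the upper tail of the hypergeometric distribution from k upward
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, tail / comb(N, n)

# Toy example (made up): 20 background genes, 5 of them in the pathway
background = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
hits = ["g0", "g1", "g2", "g10"]

k, p = ora_pvalue(hits, pathway, background)
print(k, p)
```

The same overlap yields a much smaller p-value against a larger background, so a web tool's default background can make a pathway look more enriched than it is for a pre-filtered gene list.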
Where SNPs are involved, LDSC is pretty much the way to go. The older approach (MAGMA) can work in a pinch too; these tend to be much more conservative than enrichment statistics.
Thank you again for your comment, and sorry for the bother caused by my lack of knowledge. (I'm quite new to GWAS analysis and have no one around to ask for advice.) I just submitted the gene set of the PRS candidates to Enrichr for an enrichment test and saw that the hypoxia pathway is highly enriched. Therefore, I wanted to say that since most of the significant variants (or at least those that satisfied the p-value threshold) are involved in hypoxia, hypoxia may be associated with the phenotype. Any advice or comments would be appreciated. Thank you so much.