How to calculate the absolute p-value for GSEA which returned 0.0 as the p-value?
2.8 years ago
Riq ▴ 50

I am trying to perform a GSEA for ten biochemical properties to check which properties are enriched in my set of genes. I am using the Windows application (v4.2.2) with the GSEAPreranked tool. However, when I run the analysis, some of the properties return a p-value of "0.0". The user guide states:

"In the GSEA report, a p value of zero (0.0) indicates an actual p value of less than 1/number-of-permutations. For example, if the analysis performed 100 permutations, a reported p value of 0.0 indicates an actual p value of less than 0.01. For a more accurate p value, increase the number of permutations performed by the analysis. Typically, you will want to perform 1000 permutations (phenotype or gene_set). (If you attempt to perform significantly more than 1000 permutations, GSEA may run out of memory.)"

I tried increasing the number of permutations to 10,000 and then 100,000, but both still yielded a p-value of "0.0". Is there a way to calculate the actual p-value, which I can later use to calculate the FDR q-value?

Enrichment Gene Analysis p-value GSEA Set • 3.2k views
2.8 years ago
LauferVA 4.5k

Hello Mahasish,

The answer to your question is implicit in the question itself.

GSEA is a form of permutation-based testing. The p-value is calculated empirically by permuting the gene labels at random, then counting how many of the permutations give a result more extreme than the one you actually observed.

For example, if you run 10,000 permutations and the actual data show enrichment for a pathway more extreme than all but 13 of the permutations, then the p-value is 13/10,000 = 0.0013.

If you are getting a p-value of 0.0, this means that none of the 10,000 random permutations was more enriched than your actual data for that pathway.
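To make the counting concrete, here is a minimal Python sketch of an empirical permutation p-value. It is not GSEA's actual enrichment statistic: `toy_score` (the mean score of the set members) is a hypothetical stand-in, the 200 gene names and scores are made up, and the random gene sets are only an illustration of the permutation idea.

```python
import random


def toy_score(scores, gene_set):
    """Hypothetical stand-in for an enrichment score: mean score of the set."""
    return sum(scores[g] for g in gene_set) / len(gene_set)


def permutation_p_value(scores, gene_set, n_perm, seed=0):
    """Count random gene sets scoring at least as high as the observed set."""
    rng = random.Random(seed)
    genes = list(scores)
    observed = toy_score(scores, gene_set)
    n_extreme = 0
    for _ in range(n_perm):
        # Draw a random gene set of the same size as the real one.
        random_set = rng.sample(genes, len(gene_set))
        if toy_score(scores, random_set) >= observed:
            n_extreme += 1
    return n_extreme, n_extreme / n_perm


# Made-up data: 200 genes with distinct scores; the set holds the top 10.
scores = {f"gene{i}": float(200 - i) for i in range(200)}
top_set = [f"gene{i}" for i in range(10)]

n_extreme, p = permutation_p_value(scores, top_set, n_perm=1000)
print(n_extreme, p)
```

Because the set is as extreme as it can possibly be, essentially no random draw can match it, so the count is zero and the empirical p-value comes out as 0.0. That zero only says the true p-value is below the resolution of the test, 1/`n_perm`.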

There are two solutions I'd consider.

1) Record the p-value as p < 0.0001
2) Re-run the pathway analysis with 10,000,000 permutations. This will likely give you at least a few perms that are more enriched than your data.
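As a rough sketch of the arithmetic behind these two options: the first amounts to reporting the bound 1/(number of permutations), and the second works because the expected number of "more extreme" permutations grows with the permutation count. The true p-value of 1e-6 used below is purely an assumption for illustration, and the function names are made up.

```python
def format_p(n_extreme, n_perm):
    """Report an empirical p-value, or an upper bound when no permutation won."""
    if n_extreme == 0:
        return f"p < {1 / n_perm:g}"
    return f"p = {n_extreme / n_perm:g}"


def expected_hits(true_p, n_perm):
    """Expected number of permutations more extreme than the observed data."""
    return true_p * n_perm


def prob_zero_hits(true_p, n_perm):
    """Chance that no permutation is more extreme, i.e. a reported 0.0."""
    return (1.0 - true_p) ** n_perm


print(format_p(13, 10_000))
print(format_p(0, 10_000))

assumed_true_p = 1e-6  # hypothetical value, chosen only for illustration
for n_perm in (100_000, 10_000_000):
    print(n_perm,
          expected_hits(assumed_true_p, n_perm),
          prob_zero_hits(assumed_true_p, n_perm))
```

Under that assumed true p-value, 100,000 permutations would still report 0.0 about 90% of the time, whereas 10,000,000 permutations would be expected to produce around ten more-extreme permutations.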

If you choose 2), then I agree with rpolicastro's answer in this thread: fgsea is the way to go.

Finally, are you only testing one pathway? If not, you will need to control for multiple testing.
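If more than one gene set is tested, one generic way to control for multiple testing is the Benjamini-Hochberg procedure. The sketch below is a plain-Python version of that standard procedure; it is not how GSEA derives its own FDR q-value, and the ten p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p-values for ten hypothetical gene sets.
example_p = [0.0001, 0.0013, 0.02, 0.04, 0.05, 0.2, 0.3, 0.5, 0.7, 0.9]
for raw, adj in zip(example_p, benjamini_hochberg(example_p)):
    print(f"{raw:g} -> {adj:.4g}")
```

A p-value reported only as an upper bound (for example p < 0.0001) can be entered at its bound, which makes the adjusted value conservative.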

VAL


Thank you, Vincent, for your answer. When I increase the number of permutations to 1,000,000, the software cannot handle that many and returns a "Java Out of Memory" error. I am testing for one pathway only. I have used the "fgsea" package to calculate the really small p-values.

2.8 years ago

You may want to consider switching to fgsea, which has algorithmic improvements over GSEA for better p-value estimation [PREPRINT]. It helps to resolve most cases where the p-value is really small.


Thank you so much. "fgsea" was able to calculate the really small p-value for me.
