Question

bar chart to show a p value of 0

0

17 months ago

Penny • 0

Hello,

I am trying to use a vertical bar chart to show the adjusted p-values generated from GSEA. The x-axis is -log(padj) and the y-axis represents the enriched pathways. However, some of my enriched pathways have an adjusted p-value of zero, which cannot be log-transformed. Is there a conventional way to show these p-values in a bar chart? Many thanks.

value chart bar p GSEA • 2.1k views

updated 17 months ago by Gordon Smyth ★ 7.7k • written 17 months ago by Penny • 0

0

You could assign a very small p-value, say 1e-10, to those pathways for plotting purposes and change the axis tick label at 10 to Inf.

17 months ago by Ming Tommy Tang ★ 4.5k

0

many thanks!

17 months ago by Penny • 0

1

17 months ago

dsull ★ 6.9k

The adjusted p-value is not really 0; it is simply too small to calculate or represent. Once a p-value is below 10^-10, it makes no practical difference whether it is 10^-13, 10^-100, 10^-4000, or anything else. You reject the null hypothesis regardless (and the p-value isn't an effect size anyway).

If the p-value is displayed as 0, just set it to a small floor value such as 10^-10, and then draw your plot.
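A minimal Python sketch of that flooring step, assuming base-10 logs and a floor of 10^-10. The function name `neg_log10_with_floor` and the pathway names and values are hypothetical, chosen only for illustration. It also records which values were floored, so those bars can be labelled as a lower bound rather than an exact value.

```python
import math

def neg_log10_with_floor(pvalues, floor=1e-10):
    """Return a list of (-log10(p), was_floored) pairs, one per p-value.

    Any p below `floor` (including an exact 0) is replaced by `floor`
    before the log transform, so the result is always finite.
    """
    if not 0 < floor <= 1:
        raise ValueError("floor must be in (0, 1]")
    result = []
    for p in pvalues:
        if not 0 <= p <= 1:
            raise ValueError(f"p-value out of range: {p}")
        floored = p < floor
        result.append((-math.log10(max(p, floor)), floored))
    return result

# Hypothetical adjusted p-values, not taken from any real GSEA report.
padj = {"Pathway A": 0.0, "Pathway B": 3e-4, "Pathway C": 0.02}
scores = neg_log10_with_floor(padj.values())
for name, (score, floored) in zip(padj, scores):
    # Mark floored values as a lower bound on -log10(padj).
    label = f">= {score:.0f}" if floored else f"{score:.2f}"
    print(f"{name}: {label}")
```

The `score` values can be passed to any plotting library as bar lengths; the `floored` flag is what tells you which bars to annotate.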

17 months ago by dsull ★ 6.9k

0

thank you so much

17 months ago by Penny • 0

score 3 · Accepted Answer · 2023-06-25

3

17 months ago

Gordon Smyth ★ 7.7k

No GSEA software should return exact zero p-values when correctly programmed, but the real problem is that extremely small p-values or FDR values from GSEA are statistically bogus. Any GSEA method that correctly accounts for correlation between genes will not return such unrealistically small p-values.
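As an illustration of the first point only: one common way an exact zero arises in permutation-based testing is computing the p-value as b/m, where b of the m permuted statistics are at least as extreme as the observed one. A widely used convention is (b + 1)/(m + 1), which can never be zero. The sketch below is not taken from any GSEA implementation, the function name `permutation_pvalue` is hypothetical, and it does nothing about the inter-gene correlation issue.

```python
def permutation_pvalue(observed, null_stats):
    """Upper-tail permutation p-value using the (b + 1) / (m + 1) convention.

    b is the number of permuted statistics >= the observed statistic and
    m is the number of permutations, so the smallest possible result is
    1 / (m + 1) rather than 0.
    """
    m = len(null_stats)
    if m == 0:
        raise ValueError("need at least one permutation statistic")
    b = sum(1 for s in null_stats if s >= observed)
    return (b + 1) / (m + 1)

# Hypothetical permuted statistics; the observed value exceeds all of them.
null_stats = [0.1, 0.4, 0.2, 0.3]
print(permutation_pvalue(0.9, null_stats))
```

With only four permutations the smallest attainable p-value is 1/5, which makes the resolution limit of a permutation test explicit.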

17 months ago by Gordon Smyth ★ 7.7k

0

Thank you for replying to both of my questions. I was mistaken: it is the FDR q-value in my GSEA report that is 0. I used the Broad Institute GSEA software. Is an FDR of 0 acceptable?

17 months ago by Penny • 0

1

Yes, I already guessed that you were referring to FDR from GSEA. I also guess that you are using pre-ranked GSEA instead of the original (highly respected but conservative) GSEA method.

Is this acceptable? Well, pre-ranked GSEA is a frequently used method, so many people apparently find it acceptable. My results, however, show that it gives wildly inflated significance levels. Getting FDR=0 in particular is unnecessary, unhelpful and invalid. I have written about this in a few forums:

17 months ago by Gordon Smyth ★ 7.7k

0

Exactly, I used pre-ranked GSEA. I ranked the genes using the results from limma. I am a newbie in this field, and yes, I followed this approach because I saw people use it in their papers. Now I think I need to read your posts carefully. Many thanks! I am so glad that I can get guidance from an expert.

17 months ago by Penny • 0

0

Gordon Smyth - do you have a code snippet that illustrates an implementation of the original GSEA method that you refer to (in R)? I looked, for instance, here, https://gksmyth.github.io/software.html but could not find a relevant snippet.

TYVM!

17 months ago by LauferVA 4.5k

2

The Broad Institute's published GSEA method is not available in R, and it would be much more than a code snippet if it were. I have discussed previously on the Bioconductor Support site why GSEA does not have any satisfactory implementation in R:

Implementation of the "original" GSEA algorithm in R

On the other hand, limma and edgeR provide a range of gene set testing and GSEA methods that account for inter-gene correlation and offer much more flexibility than the original GSEA method. cameraPR() is a replacement for pre-ranked GSEA and camera() or romer() are replacements for GSEA.

The limma and edgeR case studies provide examples of GSEA analyses, for example:

http://bioconductor.org/packages/release/workflows/vignettes/RNAseq123/inst/doc/limmaWorkflow.html (Section 7)
https://bioconductor.org/packages/release/workflows/vignettes/RnaSeqGeneEdgeRQL/inst/doc/edgeRQL.html (Camera gene set enrichment analysis)

Another possibility would be Efron and Tibshirani's GSA method (implemented in the CRAN package GSA). That is more powerful than the original GSEA method but still limited to two-group comparisons. camera() or cameraPR() on the other hand give much more flexibility to adjust for batch effects, sample quality, repeated measures and so on.

17 months ago by Gordon Smyth ★ 7.7k

0

Gordon - Thank you so much for taking time out. In this case, it was the plethora of answers to sort through that was the difficulty, rather than the paucity (that led me to be unsure if I'd missed anything).

At any rate, this response really helps me to frame the many posts you've made on camera(), cameraPR(), etc. on the Bioconductor forums. I appreciate it and will proceed as you indicate.

17 months ago by LauferVA 4.5k