Question

Multiple hypothesis correction with multiple gene sets

3.8 years ago

suragnair • 0

I have multiple input gene sets (N) that are not necessarily disjoint. Off-the-shelf GO enrichment tools typically report raw and adjusted p-values for each input gene set. My understanding is that no further correction is usually applied, and the adjusted p-values are reported per input gene set.

However, this does not account for the fact that we are testing N different gene sets. When terms from all N gene sets are displayed together in one figure panel, the per-set adjusted p-values would be too optimistic for the panel as a whole. Would it be more appropriate to adjust the raw p-values pooled across all input gene sets together? What is the preferred approach?

gene-set-enrichment GO multiple-test-correction GO-enrichment • 1.4k views

updated 3.8 years ago by Gordon Smyth ★ 8.1k • written 3.8 years ago by suragnair • 0

score 0 · Answer 1 · 2021-09-09

3.8 years ago

Gordon Smyth ★ 8.1k

If a GO term enrichment tool provides adjusted p-values, then they are adjusted for the number of gene sets. That's the whole point of the adjustment.

Yes, they are adjusted for the number (M) of annotated gene sets (e.g. GO:BP gene sets). However, they are not adjusted for the fact that I run the enrichment tool N times, once per input gene set, for a total of N×M tests.

Reply • 3.8 years ago by suragnair • 0

As an example, I have partitioned genes into N sets based on their dynamics over temporal samples. I run the GO enrichment tool separately on each of these N sets, which gives adjusted p-values per input gene set. In the end, however, I will present GO terms across all N gene sets in one figure. If FDR correction is applied only per input gene set, with no further correction, the FDR of the entire panel would be greater than that of an individual gene set.

Reply • 3.8 years ago by suragnair • 0

You could apply p-value adjustment across all NxM p-values, as you suggested in your question. That's valid but probably conservative. I don't know of anything better.

Reply • 3.8 years ago by Gordon Smyth ★ 8.1k
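
For illustration, the pooled adjustment discussed in this thread could be sketched as follows. This is a minimal sketch, not the method of any particular enrichment tool. It assumes the adjustment is Benjamini-Hochberg (the thread only says "FDR correction") and that the raw p-values for every input gene set and GO term are available. The helper names `bh_adjust` and `pooled_bh`, the set names, the GO term labels, and the p-values are all hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def pooled_bh(raw):
    """Adjust all N x M raw p-values together.

    raw maps input-set name -> {GO term -> raw p-value}.
    """
    keys = [(s, t) for s, terms in raw.items() for t in terms]
    adj = bh_adjust([raw[s][t] for s, t in keys])
    out = {s: {} for s in raw}
    for (s, t), q in zip(keys, adj):
        out[s][t] = q
    return out


# Hypothetical raw p-values: N = 2 input gene sets, M = 3 GO terms each.
raw = {
    "early": {"GO:A": 0.001, "GO:B": 0.02, "GO:C": 0.6},
    "late": {"GO:A": 0.3, "GO:B": 0.004, "GO:C": 0.9},
}

# Per-set adjustment (what a tool reports for each run) vs. pooled adjustment.
per_set = {s: dict(zip(t, bh_adjust(list(t.values())))) for s, t in raw.items()}
pooled = pooled_bh(raw)

for s in raw:
    for t in raw[s]:
        print(s, t, round(per_set[s][t], 4), round(pooled[s][t], 4))
```

With this toy input, the "early" terms get larger adjusted p-values under pooling (for example, GO:B goes from about 0.03 to about 0.04), while the "late" terms happen to be unchanged. The example only shows the mechanics; how conservative pooling is in practice depends on the overlap between gene sets and on how many tests are run.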