Benjamini+ Hochberg multiple testing p.adjust R
4
1
Entering edit mode
6.4 years ago
dp0b ▴ 80

Hi,

Im just trying to get my head around Benjamini + Hochberg multiple testing using the p.adjust approach in R. So is the BH mutliple testing option (pval *no of tests)/rank ?

I have a dummy dataset and bonferroni multiple testing works as should

pvals = c(2.335810e-07, 4.820826e-07, 4.820826e-07, 5.807533e-07, 5.807533e-07,6.954857e-07)
pval2=as.numeric(pvals)
BONF = p.adjust(pval2, "bonferroni")
head(BONF)


[1] 1.401486e-06 2.892496e-06 2.892496e-06 3.484520e-06 3.484520e-06
[6] 4.172914e-06

However, when I try the Benjamini and Hochberg option the adjusted pvalues are all the same

BH = p.adjust(pval2, "BH")
 head(BH)
[1] 6.954857e-07 6.954857e-07 6.954857e-07 6.954857e-07 6.954857e-07
[6] 6.954857e-07

If there is any advice on what Im doing wrong it would be much appreciated or if anybody knows the exact formula p.adjust uses for BH as well.

Thanks

R p.adjust multiple testing p values • 19k views
ADD COMMENT
2
Entering edit mode
6.4 years ago

is the BH mutliple testing option (pval *no of tests)/rank ?

No, it involves the cumulative minimum. It's most useful to think of BH adjustment as subtracting out a uniform distribution of p-values from your distribution. That won't produce the actual p-values, but it's a more useful visual representation. Alternatively, the adjusted p-value is the excess significance after plotting the p-value vs. its rank.

Given a vector of p-values p of length n:

i = n:1  # The reverse rank order
o <- order(p, decreasing = TRUE)
ro <- order(o)
pmin(1, cummin(n/i * p[o]))[ro]

Coincidentally, that's the exact code that p.adjust() uses.

ADD COMMENT
0
Entering edit mode

p.adjust does not use your q.

I also don't understand your comment on subtracting out a uniform distribution, or the "excess significance".

The adjusted p-value is simply the minimum FDR under which the test is significant.

ADD REPLY
1
Entering edit mode

It looks like I had the BY method rather than BH, I'll update the answer to correct that.

Regarding subtracting out a uniform, that's one of the ways of conceptually understanding how BH works.

ADD REPLY
1
Entering edit mode
6.4 years ago

Have you tried reading the paper ? The procedure is described in section 3. What it says is:
- Order the n p-values (smallest first)
- For a desired false discovery rate threshold alpha, compare p-value p(i) to alpha * i/n
- Find k as the largest i satisfying p(i) <= alpha * i/n
- Reject all null hypotheses p(i) for all i such that p(i)<=p(k)
Alternatively, the corrected p-value can be computed as min(p(i)*n/i, corrected.p(i+1)) for i from n-1 to 1

ADD COMMENT
1
Entering edit mode
6.4 years ago
Collin ▴ 1000

TL;DR There is nothing wrong with the output from the p.adjust function using the "BH" method. If you had more p-values that were not as highly similar, the results would generate different "adjusted p-values" from the "BH" method. You can see the exact calculations performed by p.adjust by simply typing "p.adjust" without parentheses into the R terminal.

What you're observing is a by product of the p.adjust package using a cumulative min when you specify the BH procedure. The original BH procedure (https://www.jstor.org/stable/2346101) doesn't actually specify an adjusted p-value corresponding to each p-value, rather it specifies a criterion to reject the null hypothesis:

[Equation 1 in paper]
let k be the largest i for which P(i) <= i/m * q^*
then reject all H(i), i=1,...,k

where P(i) is the i'th smallest p-value in your vector, H(i) is the i'th hypothesis, m is the number of tests, and q^* is the the desired false discovery rate.

Note that, as you point out, q = P(i) / (i / m) is basically the equivalent of an "adjusted p-value" for the BH procedure. Normally you just reject all null hypotheses that have q less than some value. Except there is a problem. q is not necessarily monotonic as you get to larger p-values. This could result in not calling all hypothesis as significant, as specified in the BH procedure to reject all H(i), from i = 1, ..., k. The fix in the p.adjust function was to use a cumulative min so that the rejection of null hypotheses will be the same as specified in the original paper.

ADD COMMENT
0
Entering edit mode
6.0 years ago
Renesh ★ 2.2k

Read this article about how to adjust P-values using Bonferroni and Benjamini and Hochberg in multiple testing https://reneshbedre.github.io/blog/mtest.html

ADD COMMENT

Login before adding your answer.

Traffic: 1817 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6