Why doesn't my p-value histogram have a uniform distribution?
6.3 years ago
afli ▴ 190

Hi my friends, I ran Fisher's exact test in R. Because I think the treatment does not affect the counts, I expected the p-values to be uniformly distributed, but the histogram is U-shaped, with large counts near 0 and 1. The code is below; could you please tell me why? Thank you very much!

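library(ggplot2)

test <- read.table("sample_fisher_test.txt")
# Keep rows whose columns 3 and 4 sum to more than 5.
test <- test[rowSums(test[, 3:4]) > 5, ]
# Run Fisher's exact test on each row. The matrix is filled column-wise,
# so columns 1 and 3 form the first column of the 2x2 table.
test$pvalue <- NA_real_
for (i in seq_len(nrow(test))) {
  x <- matrix(c(test[i, 1], test[i, 3], test[i, 2], test[i, 4]), nrow = 2)
  test$pvalue[i] <- fisher.test(x)$p.value
}
ggplot(test, aes(x = pvalue)) +
  geom_histogram(binwidth = 0.05, fill = "lightblue", colour = "black")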


data is available at: https://de.cyverse.org/dl/d/D577D93C-F511-41EE-AC74-26E2B5203564/sample_fisher_test.txt

pvalue uniform distribution fisher exact test • 3.9k views

Why do you think it should be uniform?

I've just edited the question. I expected it to be uniform, but maybe it actually should not be. I just cannot understand the U shape.

Your comment does not add any information. I personally have too little statistical background to formulate expectations about p-value distributions. You should ask yourself whether your statistical knowledge is sufficient to do so. As this is a pure statistics question, you might consider posting it on StackExchange. If you do, you can improve your chances of a good response by following the guidelines in How To Ask Good Questions On Technical And Scientific Forums, because right now your question lacks any details about the experimental setup.

Thank you ATpoint. I made the post in a hurry just now, sorry for that. I will read the guidelines carefully and do better next time, and I will post this on StackExchange to see if I can get some help.

Aifu.

Hi, see if this blog post helps you: http://varianceexplained.org/statistics/interpreting-pvalue-histogram/ . To get better answers, it would be good to give some background on what you are testing, as the U-shape may or may not be anything to worry about.

Thank you dariober, I've already seen this post. It is clear, but the solution it provides does not solve my problem.

6.3 years ago

The large number of p=1 observations arises because the test is discrete: only a limited set of p-values is attainable, so p-values are effectively "rounded up". For example, if the total count is 200 and every row and column sums to 100, the central {50, 50} {50, 50} table has a ~11.2% chance of being observed under the null hypothesis. This table corresponds to p-value=1; the adjacent {49, 51} {51, 49} and {51, 49} {49, 51} tables correspond to p-value ~0.888, etc.
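The numbers in this example can be checked with a short R sketch. It uses only base R's `dhyper` and `fisher.test`; the helper name `fisher_p` and the variable names are illustrative, and the tolerance factor is an assumption meant to guard against floating-point ties.

```r
# 2x2 tables with all four margins fixed at 100 (total count 200).
# Under the null, the top-left cell follows a hypergeometric distribution.
k <- 0:100
prob <- dhyper(k, m = 100, n = 100, k = 100)  # P(top-left cell = k)

# Probability of observing the central table {50, 50} {50, 50}.
p_central <- prob[k == 50]

# Two-sided p-value: total probability of all tables that are
# no more likely than the observed one (small tolerance for ties).
fisher_p <- function(a) sum(prob[prob <= prob[k == a] * (1 + 1e-7)])

p_central      # about 0.112
fisher_p(50)   # 1: every table is at most as likely as the central one
fisher_p(49)   # about 0.888

# Cross-check against fisher.test() for the adjacent table.
fisher.test(matrix(c(49, 51, 51, 49), nrow = 2))$p.value
```

Because no table can produce a p-value between about 0.888 and 1, roughly 11% of null tables with these margins end up at exactly p=1.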

To avoid this upward bias, you can use the "mid-p value". In the example above, the most-central table has a mid-p value of ~0.944: the center, instead of the upper end, of the probability interval it corresponds to. The mid-p value has the nice property that, under the null hypothesis, the Q-Q plot should stay near the main diagonal.
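As a rough illustration, the mid-p idea can be sketched in base R. This assumes the mid-p value is defined as the probability of strictly less likely tables plus half the probability of equally likely ones; the helper names `table_probs` and `fisher_p_and_midp` are hypothetical, and the calculator linked in this thread may differ in detail.

```r
# Null probabilities of every 2x2 table sharing the margins of `tab`.
table_probs <- function(tab) {
  m <- sum(tab[, 1]); n <- sum(tab[, 2]); k <- sum(tab[1, ])
  support <- max(0, k - n):min(k, m)
  list(x = support, prob = dhyper(support, m, n, k))
}

# Standard two-sided p-value and mid-p value for one table.
fisher_p_and_midp <- function(tab, tol = 1e-7) {
  tp <- table_probs(tab)
  p_obs <- tp$prob[tp$x == tab[1, 1]]
  tied <- abs(tp$prob - p_obs) <= p_obs * tol   # equally likely tables
  less <- tp$prob < p_obs & !tied               # strictly less likely tables
  c(p = sum(tp$prob[less | tied]),
    midp = sum(tp$prob[less]) + 0.5 * sum(tp$prob[tied]))
}

central <- matrix(c(50, 50, 50, 50), nrow = 2)
fisher_p_and_midp(central)   # p = 1, mid-p about 0.944

# Exact null expectation of each quantity over all tables with these margins.
tp <- table_probs(central)
vals <- sapply(tp$x, function(a)
  fisher_p_and_midp(matrix(c(a, 100 - a, 100 - a, a), nrow = 2)))
mean_p    <- sum(tp$prob * vals["p", ])     # above 0.5 (biased upward)
mean_midp <- sum(tp$prob * vals["midp", ])  # 0.5, as for a uniform variable
```

The mid-p values are still discrete, so a histogram will not be perfectly flat, but under this definition their null mean matches that of a uniform distribution.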

(The same things are true for the binomial test you asked about earlier.)

Incidentally, I posted JavaScript Fisher's exact test and binomial test calculators up at https://www.cog-genomics.org/software/stats several years ago; the FET includes an option for turning the mid-p adjustment on/off, if you want to see more examples of the difference it makes.

Thank you chrchang523, that sounds good. I did this using your 'fisher_test' function with the mid-p correction: the tall bar at 1 is reduced to a low value, and the 0.95 bar is greatly increased. Is this shift from 1 to 0.95 expected? (I filtered the counts more strictly, so the overall bar heights are lower than in the original picture.)


This is expected if your sample sizes are such that a p-value of 1 usually corresponds to a mid-p value in the 0.95 bin.
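To get a feel for how sample size controls where the former p=1 tables land, here is a small R sketch. It assumes balanced tables (both groups of size n, first row total also n) purely for illustration, and the helper name `midp_of_mode` is hypothetical; real data with uneven margins will give different values.

```r
# Mid-p value of the most probable table(s), i.e. those with p-value 1,
# when every margin of the 2x2 table equals n.
midp_of_mode <- function(n) {
  prob <- dhyper(0:n, m = n, n = n, k = n)
  # For odd n two tables are tied for most probable; count both.
  tied <- abs(prob - max(prob)) <= max(prob) * 1e-7
  1 - 0.5 * sum(prob[tied])
}

sizes <- c(10, 20, 50, 100, 200)
data.frame(n = sizes, midp = sapply(sizes, midp_of_mode))
```

With n = 100 this gives about 0.944, and the value moves closer to 1 as n grows. Which histogram bar it falls into depends on the bin edges used.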
