16 months ago
Apex92
Dear all,
I have two gene sets and I want to see if the amount of overlap between these two sets is significant using binomial statistics.
I came up with this approach in R, but it does not give a significant p-value. However, with a hypergeometric test (assuming a background set of 10,000 genes in total), I do get a significant p-value.
# Parameters
n_A <- 90 # Number of genes in Set A
n_B <- 2588 # Number of genes in Set B
k <- 37 # Number of overlapping genes
# Probability of overlap
p <- n_A / n_B
p_value <- 1 - pbinom(k - 1, n_B, p)
print(paste("Calculated p-value:", p_value))
How can I resolve this discrepancy?
Another question: does n_B always have to be the larger set in the binomial test?
Thank you in advance.
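A minimal sketch of one way to set up the comparison, assuming the 10,000-gene background mentioned above (the names N, p_null, p_binom, p_hyper and the "swapped" variants are introduced here for illustration only). In the code as posted, p = n_A / n_B with n_B trials gives an expected overlap of 90 genes, so seeing 37 or more is almost certain, which would explain the non-significant result. Treating each gene in Set A as a trial that falls in Set B with probability n_B / N is one common binomial approximation to the hypergeometric test:

```r
# Sketch under stated assumptions, not a definitive implementation.
N   <- 10000  # background gene universe (assumption taken from the question)
n_A <- 90     # number of genes in Set A
n_B <- 2588   # number of genes in Set B
k   <- 37     # number of overlapping genes

# Binomial view: n_A trials, each landing in Set B with probability n_B / N
# under the null hypothesis of no association.
p_null  <- n_B / N
p_binom <- pbinom(k - 1, size = n_A, prob = p_null, lower.tail = FALSE)

# Hypergeometric view: draw n_A genes without replacement from N genes,
# n_B of which belong to Set B. Arguments are q, m, n, k in that order.
p_hyper <- phyper(k - 1, n_B, N - n_B, n_A, lower.tail = FALSE)

# Swapping the roles of Set A and Set B.
p_hyper_swapped <- phyper(k - 1, n_A, N - n_A, n_B, lower.tail = FALSE)
p_binom_swapped <- pbinom(k - 1, size = n_B, prob = n_A / N, lower.tail = FALSE)

cat("Binomial p-value:               ", p_binom, "\n")
cat("Hypergeometric p-value:         ", p_hyper, "\n")
cat("Hypergeometric p-value, swapped:", p_hyper_swapped, "\n")
cat("Binomial p-value, swapped:      ", p_binom_swapped, "\n")
```

On the second question: the hypergeometric test is symmetric in the two sets, so it does not matter which one is larger. The binomial approximation is not symmetric, so swapping the sets changes its p-value somewhat.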
Thank you for your comment. So based on the thread you shared, I assume I can calculate the p-value as:
Is that correct? And it does not matter whether n_A is bigger or smaller than n_B, right?