6.3 years ago
Bayram Sarilmaz
I have two gene sets: A and B. I would like to check which genes in B are enriched in A. As a result of the enrichment analysis, I would like to have a p-value for each gene in B.
Here is a reproducible example using gene IDs:
library(GeneOverlap)
# Gene IDs as plain vectors; newGeneOverlap() takes gene lists as vectors, not data frames
A <- c(100,200,300,400,500,600,700,100,800,900,1000,100,500,100) # Gene IDs in set A
B <- c(200,4,900,100,6) # Gene IDs in set B
# check if B gene IDs are enriched in set A, and generate a p-value for the enrichment of each gene
go.obj <- newGeneOverlap(B, A)
go.obj
go.obj <- testGeneOverlap(go.obj)
print(go.obj)
In the above example I tried the GeneOverlap package, but it didn't give me a p-value for every gene in B. Any suggestions for other methods to achieve what I'm aiming for?
You cannot generate per-gene p-values for enrichment, only a "set"-level enrichment, i.e. does set B overlap with set A more than expected by chance? Maybe you could elaborate on your question/goal? To me, "I would like to check which genes in B are enriched in A" means "which genes in B are also in A".
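To make the "set-level" idea concrete, here is a minimal base-R sketch of a one-sided Fisher's exact test on the 2x2 overlap table for the example sets. The background size `universe_size` is a made-up value for illustration only (it is not from the question); in a real analysis it should be the number of genes that could have ended up in either list.

```r
# Example gene sets from the question, de-duplicated
A <- unique(c(100,200,300,400,500,600,700,100,800,900,1000,100,500,100))
B <- unique(c(200,4,900,100,6))

# ASSUMPTION: hypothetical background of 20,000 genes, for illustration only
universe_size <- 20000

in_both <- length(intersect(A, B))  # genes in both A and B
only_A  <- length(setdiff(A, B))    # genes in A but not in B
only_B  <- length(setdiff(B, A))    # genes in B but not in A
neither <- universe_size - in_both - only_A - only_B

# Rows: in B / not in B; columns: in A / not in A
tab <- matrix(c(in_both, only_A, only_B, neither), nrow = 2,
              dimnames = list(B = c("yes", "no"), A = c("yes", "no")))

# One-sided test: is the overlap larger than expected by chance?
res <- fisher.test(tab, alternative = "greater")
res$p.value
```

This yields a single p-value for the pair of sets, not one per gene, and the value depends heavily on the assumed background size.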
Thanks for explaining this! So if the p-value from the GeneOverlap test above is equal to 0, does that mean there is no significant overlap between A and B?
You should question P values that are equal to zero, particularly when you're dealing with just 5 elements in your B object and when a visual inspection reveals that only three-fifths of B form a subset of A. Why not just report that, i.e., that 60% of B overlaps with A? Why do you need a P value when human eyes are sufficient? Presumably your actual gene lists are much larger?
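The 60% figure can be reproduced directly with a membership check in base R; this is only a small sketch using the example vectors from the question.

```r
# Example gene sets from the question
A <- c(100,200,300,400,500,600,700,100,800,900,1000,100,500,100)
B <- c(200,4,900,100,6)

shared <- B %in% A   # TRUE where a gene in B also occurs in A
B[shared]            # 200 900 100
mean(shared)         # 0.6, i.e. 60% of B overlaps with A
```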
See @Kevin's comment below about the meaningfulness of your p-value. But to answer your direct question: no, a p-value of 0 would mean that the overlap is statistically significant, i.e. there is a 0% chance that you would find this degree of overlap or greater if there were no association between the two lists.
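As a rough illustration of "this degree of overlap or greater", the tail probability can be sketched with the hypergeometric distribution in base R. The background size `N` below is a hypothetical value chosen for the example, not something given in the question. Under that assumption the probability comes out very small but strictly positive, which is one reason to be suspicious of a reported p-value of exactly 0.

```r
# ASSUMPTION: hypothetical background of 20,000 genes, for illustration only
N <- 20000
m <- 10   # unique genes in A
k <- 5    # unique genes in B
x <- 3    # genes shared by A and B

# P(overlap >= x) when k genes are drawn at random from N, of which m are in A
p <- phyper(x - 1, m, N - m, k, lower.tail = FALSE)
p
```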