
Question

Interpretation of Overlapping p-value = 0 and Odds ratio = Inf


6.9 years ago

Bayram Sarilmaz ▴ 50

I would like to run an enrichment analysis between two gene lists using the GeneOverlap package in R. I am comparing the gene IDs from each list, and neither list contains any NAs.

This is the code I'm using:

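library(GeneOverlap)
# Build the overlap object from the two vectors of gene IDs
go.obj <- newGeneOverlap(listA$ID, listB$ID)
go.obj
# Run Fisher's exact test on the overlap, then print the detailed report
go.obj <- testGeneOverlap(go.obj)
print(go.obj)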

Output

Detailed information about this GeneOverlap object:
listA size=2765, e.g. 56163 60385 79882
listB size=9204, e.g. 9604 55585 56163
Intersection size=2765, e.g. 56163 60385 79882
Union size=9204, e.g. 56163 60385 79882
Genome size=23000
# Contingency Table:
      notA  inA
notB 13796    0
inB   6439 2765
Overlapping p-value=0e+00
Odds ratio=Inf
Overlap tested using Fisher's exact test (alternative=greater)
Jaccard Index=0.3

I need help understanding the resulting p-value (= 0) and odds ratio (= Inf). Are they the result of something wrong in my input data, or do those results have a meaningful statistical interpretation?

r GeneOverlap odds-ratio fisher's-exact • 6.8k views

updated 6.9 years ago by Santosh Anand 5.8k • written 6.9 years ago by Bayram Sarilmaz ▴ 50

score 3 · Answer 1 · 2018-08-17

What is it that you did not understand? You are testing the overlap between listA and listB, and your listA is a subset of listB: every element of listA is also in listB, which is why the intersection size equals the size of listA (2765). The overlap is therefore about as significant as it can be, so the p-value is reported as 0. It is actually a very small positive number rounded to zero; you can check by running your own Fisher test in R.
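To see that the p-value is tiny rather than exactly zero, one option is to compute it on the log scale. This is a minimal sketch, assuming the counts from the contingency table in the question. Because listA cannot overlap listB by more than its own size, the one-sided p-value reduces to a single hypergeometric probability. The variable names are mine, not part of GeneOverlap.

```r
# Counts taken from the contingency table in the question
in_B     <- 6439 + 2765   # genes in listB (9204)
not_in_B <- 13796         # genes outside listB
size_A   <- 2765          # genes in listA, all of which are in listB

# P(overlap >= 2765) equals P(overlap == 2765), since 2765 is the
# largest overlap a list of 2765 genes can have
p_plain <- dhyper(size_A, in_B, not_in_B, size_A)              # ordinary scale
log_p   <- dhyper(size_A, in_B, not_in_B, size_A, log = TRUE)  # natural log scale

p_plain           # too small to show as anything other than zero
log_p / log(10)   # the same p-value expressed as a power of ten
```

Working on the log scale keeps the value finite, so the exponent can still be reported even though the plain p-value cannot.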

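The odds ratio measures how strongly membership in one list is associated with membership in the other; an odds ratio of 1 means the lists are independent. An odds ratio of infinity arises here because no gene in listA falls outside listB (the inA/notB cell of the contingency table is 0). In other words, the lists are highly dependent, because one is contained in the other.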
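As a rough illustration of where Inf comes from, the sample odds ratio can be computed by hand from the contingency table. This is only a sketch with variable names of my own: fisher.test() reports a conditional maximum-likelihood estimate rather than this simple cross-product, although both are infinite when the inA/notB cell is 0.

```r
# Cells of the contingency table from the question
notA_notB <- 13796
inA_notB  <- 0
notA_inB  <- 6439
inA_inB   <- 2765

# Cross-product (sample) odds ratio; the zero cell puts 0 in the denominator
odds_ratio <- (notA_notB * inA_inB) / (inA_notB * notA_inB)
odds_ratio
```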

do those results have a meaningful statistical interpretation?

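Oh yes! Your results are extremely significant: the p-value is effectively 0 and the odds ratio is infinite. I ran a Fisher test on my end (note that fisher.test is two-sided by default, whereas GeneOverlap used alternative=greater):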

> x
      [,1] [,2]
[1,] 13796    0
[2,]  6439 2765
> fisher.test(x)

    Fisher's Exact Test for Count Data

data:  x
p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 1604.22     Inf
sample estimates:
odds ratio 
       Inf

As you can see, the 95% confidence interval for the odds ratio is also extremely high: its lower bound is about 1604, so it is nowhere near 1. This shows that the lists are very strongly dependent, not independent.
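For a result that lines up more directly with what GeneOverlap reports, the same table can be tested one-sided. This is a minimal sketch, assuming the matrix is rebuilt from the counts in the question; alternative = "greater" is chosen to match the "alternative=greater" line in the GeneOverlap output, and the dimnames are only labels I added for readability.

```r
# Rebuild the 2x2 table (matrix() fills column by column):
# rows are notB / inB, columns are notA / inA
x <- matrix(c(13796, 6439, 0, 2765), nrow = 2,
            dimnames = list(c("notB", "inB"), c("notA", "inA")))

# One-sided Fisher's exact test for a positive association between the lists
res <- fisher.test(x, alternative = "greater")

res$p.value    # one-sided p-value
res$estimate   # conditional maximum-likelihood estimate of the odds ratio
res$conf.int   # one-sided 95% confidence interval
```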