probability of observing overlap between sets of genes
5.3 years ago

Imagine there are two species, A and B. A has 560 genes and B has 101 genes; 66 genes are common to both species. After differential expression (DE) analysis, we observed 18 DE genes in species A and 5 DE genes in species B, and 2 genes are DE in both species. Our question is whether this overlap of two genes is larger than expected by chance (and does the question itself make sense)?

What I do is estimate the probability of getting an overlap of 2 or more when randomly selecting 18 and 5 elements from two lists of 560 and 101 elements. In R:

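library(lattice)  # provides histogram()

# Estimate how often two random DE gene sets overlap by at least 2 genes.
# Runs 1000 replicates of n_iter random draws each and records the rate (%) per replicate.
simulate_de_rate <- function(n_iter) {
  result_vector <- numeric(1000)
  for (n in 1:1000) {
    n_matched <- 0
    for (i in seq_len(n_iter)) {
      # 18 DE genes drawn from the 560 genes of species A (labels 1..560)
      a <- sample(1:560, 18)
      # 5 DE genes drawn from the 101 genes of species B: labels 1..66 are the
      # genes shared with A, labels -34..0 are the 35 genes unique to B
      b <- sample(-34:66, 5)
      if (length(intersect(a, b)) >= 2) {
        n_matched <- n_matched + 1
      }
    }
    result_vector[n] <- n_matched / n_iter * 100
  }
  print(result_vector)
  histogram(result_vector)
}

simulate_de_rate(1000)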

So the probability of getting this overlap by chance is very low, ~0.3%. Is it valid to say that the result we see is not due to chance? Can you suggest a statistically more rigorous way to calculate this? Thanks
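For what it's worth, the same null model (random draws of 18 from 560 and 5 from 101, with 66 shared genes) can probably be evaluated exactly instead of by simulation. The sketch below is only that: it assumes B has 101 - 66 = 35 genes of its own, and the variable names are mine. It sums over how many of B's 5 DE genes land in the shared set, then asks how likely it is that at least 2 of those are also among A's 18 DE genes.

```r
# Exact tail probability under the random-draw null model (a sketch).
n_a <- 560       # genes in species A
n_b <- 101       # genes in species B
n_shared <- 66   # genes present in both species
de_a <- 18       # DE genes in A
de_b <- 5        # DE genes in B
observed <- 2    # observed overlap of DE genes

# j = number of B's DE genes that fall among the shared genes
j <- 0:de_b
p_j <- dhyper(j, n_shared, n_b - n_shared, de_b)

# Given j, the overlap is hypergeometric: j "marked" genes among A's 560,
# of which A draws 18; lower.tail = FALSE gives P(overlap > observed - 1)
p_overlap <- phyper(observed - 1, j, n_a - j, de_a, lower.tail = FALSE)

p_tail <- sum(p_j * p_overlap)
p_tail
```

Note that this only answers "how surprising is the overlap if DE genes were picked at random from each full gene list"; it does not account for how many of the DE genes actually lie within the 66 shared genes.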


You probably want to take a look at the GeneOverlap package (Bioconductor).
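As far as I know, the test in that package is built on Fisher's exact test against a common background gene set. A minimal base-R sketch of the same idea follows; it does not use the package itself. The helper name `overlap_fisher` and the gene IDs are hypothetical placeholders, and the background is assumed to be the 66 genes shared by both species, so each DE list is first restricted to that background.

```r
# One-sided Fisher's exact test for the overlap of two gene lists (a sketch).
# overlap_fisher is a hypothetical helper, not a function from any package.
overlap_fisher <- function(de_a, de_b, universe) {
  # Only genes present in the common background can overlap
  de_a <- intersect(de_a, universe)
  de_b <- intersect(de_b, universe)
  # Fixed factor levels keep the table 2x2 even if a list is empty
  in_a <- factor(universe %in% de_a, levels = c(FALSE, TRUE))
  in_b <- factor(universe %in% de_b, levels = c(FALSE, TRUE))
  # "greater" tests for more overlap than expected by chance
  fisher.test(table(in_a, in_b), alternative = "greater")
}

# Placeholder data for illustration only: 66 background genes,
# 4 of them DE in A, 3 DE in B, 2 DE in both
universe <- paste0("g", 1:66)
res <- overlap_fisher(c("g1", "g2", "g3", "g4"), c("g1", "g2", "g50"), universe)
res$p.value
```

With real data you would pass the actual DE gene IDs of each species and the IDs of the shared genes as `universe`.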


Thanks, more than enough to figure out how to do it correctly :)
