How To Compute The Significance Of The Overlap Between Two Gene Sets
11.3 years ago
ChIP ▴ 600

Hi!

I have a gene list from a KEGG pathway, say the TGF-beta signalling pathway, that has 80 genes in it (set A). I also have a second list of 45 genes (set B), and the two sets have 15 genes in common.

How can I get a p-value associated with this overlap?

Has anyone done this before?

I think R's phyper function could be applied to this...

Thank you

pathway statistics • 20k views

Dear folks,

What method would be best for computing the significance of the overlap among three groups of genes? To calculate the p-values I used Fisher's exact test. If I understand correctly, Fisher's exact test is only practical for comparing two groups of genes, right?

Thanks!

11.3 years ago
Vikas Bansal ★ 2.4k

For a quick solution, look at this page.

It is a web-based CGI script that computes the statistical significance of the overlap between two groups of genes via the hypergeometric distribution.
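For anyone who wants to run the same kind of calculation locally, here is a minimal sketch of the upper-tail hypergeometric p-value in plain Python. It is not the code behind the linked page, and the function name `overlap_pvalue` is made up for illustration. The big assumption is the background size `N`: the question does not say how many genes were considered in total, so 20000 below is only a placeholder and the result depends strongly on it. In R, the corresponding call should be `phyper(14, 80, N - 80, 45, lower.tail = FALSE)`.

```python
from math import comb


def overlap_pvalue(universe, size_a, size_b, overlap):
    """Return P(X >= overlap) when size_b genes are drawn without
    replacement from `universe` genes, size_a of which belong to set A."""
    max_overlap = min(size_a, size_b)
    # Number of ways to draw size_b genes with exactly k of them in set A,
    # summed over every k at least as large as the observed overlap.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    # Divide by the number of ways to draw any size_b genes.
    return tail / comb(universe, size_b)


# ASSUMPTION: background of 20000 genes; replace with the real number of
# genes that could have ended up in either list.
N = 20000
p = overlap_pvalue(N, 80, 45, 15)
print(f"P(overlap >= 15) = {p:.3g}")
```

The exact integer arithmetic avoids rounding problems in the tail, at the cost of being slow for very large sets. With only two lists this is equivalent to a one-sided Fisher's exact test on the 2x2 table of membership counts.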

6.1 years ago
Vasei ▴ 30

You can use resampling (sometimes called a permutation test) in this case. Assuming the full list of genes you are drawing from is L, you can sample A of size 80 and B of size 45 independently from L and check whether the size of their intersection is at least 15. By repeating this many times you get an estimate of the probability of seeing an overlap at least as extreme as 15 under the assumption that A and B are independent. If this probability is very small, independence is probably not a good assumption.
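The resampling idea above can be sketched roughly as follows, using only the Python standard library. The function name `simulate_overlap_pvalue`, the seed, the iteration count, and the background size of 20000 genes are all assumptions made for illustration; none of them come from the question. The add-one correction in the return value is a common convention to avoid reporting a p-value of exactly zero, not something the answer prescribes.

```python
import random


def simulate_overlap_pvalue(universe, size_a, size_b, observed,
                            n_iter=10000, seed=0):
    """Estimate P(overlap >= observed) by drawing A and B independently
    and uniformly at random from a background of `universe` genes."""
    rng = random.Random(seed)
    genes = range(universe)  # genes are represented by integer indices
    hits = 0
    for _ in range(n_iter):
        a = set(rng.sample(genes, size_a))
        b = rng.sample(genes, size_b)
        # Count the genes of B that also fall in A.
        if sum(g in a for g in b) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


# ASSUMPTION: background of 20000 genes, as a placeholder for the real L.
estimate = simulate_overlap_pvalue(20000, 80, 45, 15, n_iter=2000)
print(f"estimated P(overlap >= 15) = {estimate:.3g}")
```

Note that the estimate cannot go below 1 / (n_iter + 1), so for a very rare overlap the simulation only tells you the p-value is smaller than that bound. The advantage of resampling is that the sampling step can be changed to reflect a more realistic null model, for example one where some genes are more likely to be picked than others.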
