Testing overlap significance of three gene sets using the hypergeometric distribution
19 months ago
Pac314 ▴ 10

Is it appropriate to take the three-way intersection of three overlapping gene sets and use the phyper function in R to assess the significance of the overlap? I have used the hypergeometric distribution for pairwise intersections before, but not for intersections of more than two sets.

Hypergeometric-distribution gene-sets • 1.2k views

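You're better off just simulating it. For 1,000 or more iterations, randomly sample three gene sets of the same sizes as your original three, drawing from the same background set of genes, and record the size of the three-way overlap each time. For the overlap to be significant at the 5% level, your observed overlap should be greater than 95% of these simulated overlaps.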
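A minimal sketch of that simulation in base R is below. The background (`universe`) and the three gene sets are made-up toy data, and the null model assumed here is that each set is an independent uniform random draw, without replacement, from the same background. Swap in your own background and sets.

```r
# Simulation-based test for a three-way gene set overlap (illustrative sketch).
set.seed(42)  # reproducible random draws

# Hypothetical toy data: 2000 background genes and three overlapping sets.
universe <- paste0("gene", 1:2000)
set_a <- universe[1:200]
set_b <- universe[151:400]
set_c <- universe[171:500]

# Observed size of the three-way intersection.
observed <- length(Reduce(intersect, list(set_a, set_b, set_c)))

# Null distribution: resample three sets of the same sizes from the background.
n_iter <- 1000
simulated <- replicate(n_iter, {
  s1 <- sample(universe, length(set_a))
  s2 <- sample(universe, length(set_b))
  s3 <- sample(universe, length(set_c))
  length(Reduce(intersect, list(s1, s2, s3)))
})

# Empirical one-sided p-value; the +1 terms keep it from being exactly zero.
p_empirical <- (sum(simulated >= observed) + 1) / (n_iter + 1)
p_empirical
```

The choice of background matters a lot here: sampling from all annotated genes rather than, say, only the genes expressed in your experiment will usually make the overlap look more significant than it is.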


+1 for the simulation mentioned by rpolicastro; it is the typical approach. However, also check out the SuperExactTest R package, which implements testing for intersections of multiple sets (it is also nice for analyzing all possible intersections in one go).
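To give a feel for what an exact test of a three-way intersection involves, here is a base R sketch. It is not the SuperExactTest implementation, only a hand-rolled illustration under the assumption that the three sets are independent uniform random draws from a common background of `n_universe` genes. The function name `three_way_pvalue` and the numbers in the example call are hypothetical. The idea is to condition on the size j of the overlap of the first two sets (hypergeometric), then treat the overlap of those j genes with the third set as a second hypergeometric draw.

```r
# Exact upper-tail p-value for a three-way overlap under random sampling
# (illustrative sketch, not the SuperExactTest code).
three_way_pvalue <- function(k_obs, n_a, n_b, n_c, n_universe) {
  # Possible sizes j of the overlap between sets A and B.
  j <- 0:min(n_a, n_b)
  # P(|A n B| = j): ordinary pairwise hypergeometric probability.
  p_ab <- dhyper(j, n_a, n_universe - n_a, n_b)
  # P(|A n B n C| >= k_obs given |A n B| = j): set C overlapping j fixed genes.
  p_tail <- phyper(k_obs - 1, j, n_universe - j, n_c, lower.tail = FALSE)
  # Average the conditional tail probability over the distribution of j.
  sum(p_ab * p_tail)
}

# Toy example: sets of 200, 250 and 330 genes from 2000, with 30 genes shared.
three_way_pvalue(30, 200, 250, 330, 2000)
```

So a single call to phyper is not enough for three sets, but two nested hypergeometric steps are. In the package itself, the main entry point is `supertest()`, which takes a list of sets plus the background size `n`; check its documentation for the details.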


Thank you both for your answers! This library is perfect for my analysis. Thank you for sharing this!
