Question

Significance Analysis Of Chip Seq Overlaps

11.9 years ago

revival786 ▴ 50

If I have two types of transcription factor binding regions, say A and B:

Suppose A binding regions = 5000, B binding regions = 10000, and observed overlaps = 500. How do I calculate the significance? [No peak data is available.]

statistics • 4.5k views

updated 11.9 years ago by Ido Tamir 5.2k • written 11.9 years ago by revival786 ▴ 50

score 1 · Answer 1 · 2012-12-21

This is a duplicate of "How Do You Calculate If Two Sets Of Genomic Regions Overlap Significantly?", where some answers are given. For completeness:

Here is a review of some resampling methods (the Bioconductor package is still not released): http://www.biomedcentral.com/1471-2105/11/359/

The ENCODE GSC (genome structure correction) from the ENCODE trial project has still not been published independently in a paper, but some Python source code is available here (the original was in MATLAB, but could be made to run in Octave): http://www.encodestatistics.org/svn/genome_structural_correction/python_encode_statistics/trunk/

You cannot answer this without peak data, because the significance of the overlaps depends inversely on the size of the peaks: the larger the peaks, the more likely it is that two of them overlap by chance. Mappability/detectability is also crucial.
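
To make the peak-size point concrete, here is a rough sketch of how the number of overlaps expected by chance changes with peak width. Everything in it beyond the three counts from the question is an assumption for illustration: the genome size, the peak widths, the function name `expected_overlaps`, and above all the null model (peaks of fixed width dropped uniformly at random, no mappability, no clustering). It is not GSC and not a substitute for a resampling test on real peaks.

```python
import math


def expected_overlaps(n_a, n_b, width_a, width_b, genome_size):
    """Expected number of A regions that hit at least one B region by chance.

    Naive null model (an assumption, not a published method): every region
    is placed uniformly at random and edge effects are ignored.
    """
    # Chance that one A region overlaps one particular B region.
    p_pair = (width_a + width_b) / genome_size
    if p_pair >= 1.0:
        return float(n_a)
    # Chance that one A region overlaps at least one of the n_b B regions:
    # 1 - (1 - p_pair) ** n_b, written with log1p/expm1 for numerical accuracy.
    p_any = -math.expm1(n_b * math.log1p(-p_pair))
    return n_a * p_any


GENOME_SIZE = 3_000_000_000  # assumed, roughly human-sized
N_A, N_B, OBSERVED = 5000, 10000, 500  # counts from the question

for width in (100, 1_000, 10_000, 100_000):  # hypothetical peak widths in bp
    exp = expected_overlaps(N_A, N_B, width, width, GENOME_SIZE)
    print(f"width {width:>7} bp: expected ~{exp:8.1f}, observed/expected = {OBSERVED / exp:.2f}")
```

Under these assumptions the same 500 overlaps are far above expectation for 100 bp peaks and far below it for 100 kb peaks, which is why the counts alone do not determine the significance.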