Suppose I have two types of transcription factor binding regions, say A and B:
A binding regions = 5000, B binding regions = 10000, overlaps observed = 500. How do I calculate the significance of the overlap? [No peak data is available.]
This is a duplicate of "How Do You Calculate If Two Sets Of Genomic Regions Overlap Significantly?", where some answers are given. For completeness:
Here is a review of some resampling methods (the Bioconductor package is still not released): http://www.biomedcentral.com/1471-2105/11/359/
The ENCODE GSC (genome structure correction) from the ENCODE trial project has still not been published independently in a paper, but some Python source code is available here (the original was in MATLAB, but could be made to run in Octave): http://www.encodestatistics.org/svn/genome_structural_correction/python_encode_statistics/trunk/
You cannot answer this without peak data, because the significance of the overlaps goes down as the peaks get larger: the larger the peaks, the more likely it is that two of them overlap by chance. Mappability/detectability is also crucial.
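To make the peak-size point concrete, here is a rough Python sketch of a naive null model. Everything in it beyond the three counts from the question is an assumption for illustration: a single linear genome of 3 Gb, one fixed width for all regions, uniform random placement (so mappability and chromosome structure are ignored), and "overlaps" taken to mean the number of A regions touching at least one B region. The function names are made up for this example. Under these assumptions the expected chance overlap is roughly n_A * n_B * (w_A + w_B) / G while that number is small, so it grows with the region width.

```python
import bisect
import random


def count_overlapping_a(a_starts, b_starts, width_a, width_b):
    """Count A regions that overlap at least one B region.

    Regions are half-open intervals [start, start + width).
    """
    b_sorted = sorted(b_starts)
    hits = 0
    for a in a_starts:
        # A overlaps B iff  a - width_b < b_start < a + width_a.
        lo = bisect.bisect_right(b_sorted, a - width_b)
        hi = bisect.bisect_left(b_sorted, a + width_a)
        if hi > lo:
            hits += 1
    return hits


def simulate_null(n_a, n_b, width_a, width_b, genome_len, n_iter, seed=0):
    """Overlap counts when both sets are placed uniformly at random.

    ASSUMPTION: uniform placement on one linear genome; real data would
    need mappability and chromosome boundaries taken into account.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_iter):
        a = [rng.randrange(genome_len - width_a) for _ in range(n_a)]
        b = [rng.randrange(genome_len - width_b) for _ in range(n_b)]
        counts.append(count_overlapping_a(a, b, width_a, width_b))
    return counts


def empirical_p(null_counts, observed):
    """One-sided empirical p-value with the usual add-one correction."""
    at_least = sum(c >= observed for c in null_counts)
    return (at_least + 1) / (len(null_counts) + 1)


GENOME_LEN = 3_000_000_000  # ASSUMPTION: roughly human-sized genome
OBSERVED = 500              # from the question

# ASSUMPTION: these widths are invented; the real ones are exactly
# the missing peak data.
for width in (200, 5000, 20000):
    null = simulate_null(5000, 10000, width, width, GENOME_LEN, n_iter=10)
    mean_null = sum(null) / len(null)
    print(width, mean_null, empirical_p(null, OBSERVED))
```

The same 500 observed overlaps can look far above chance for narrow regions and unremarkable for wide ones, which is why the counts alone are not enough. With only 10 iterations the smallest p-value this can report is 1/11, so a real test would need many more shuffles and a more realistic null.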
OK, suppose I use a permutation test in R (like the coccur package) while going without peak data. Where should I start, as I am a beginner at R?