4.0 years ago
Pappu
What is the difference between pooling 2 replicates and 2 IgG controls for ChIP-seq peak calling, compared to calling peaks on each replicate separately and then using IDR to integrate the peaks?
IDR gives you the peaks that are reasonably consistent between replicates (and therefore more reliable, probably preferred when looking for "true" binding locations), while merging just "takes it all" and leaves no way to make statements about reproducibility.
I agree. If you are having trouble getting peaks because the sequencing depth of the samples is borderline, the quality is bad, etc., you could consider pooling the replicates to explore your results...
On top of the per-sample quality, what people often forget is that IDR requires peaks to be called with relaxed settings, e.g. a p-value threshold (not q-value) of around 0.05, so very non-stringent. The reason is that the model needs both signal and noise, and calling too stringently removes much of the noise, which makes the model inadequate. You also have to sort the peaks by the ranking metric, e.g. the -log10(p) field. This tutorial summarizes the essentials: https://hbctraining.github.io/Intro-to-ChIPseq/lessons/07_handling-replicates-idr.html
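To make the sorting step concrete, here is a minimal Python sketch. It assumes the peaks are in MACS2 narrowPeak format, where column 8 holds -log10(p-value). The function name and the example records are made up for illustration.

```python
def sort_narrowpeak_by_pvalue(lines):
    """Return narrowPeak records sorted by column 8 (-log10 p-value), strongest first."""
    records = [
        ln.rstrip("\n")
        for ln in lines
        # Skip blank lines and optional header/comment lines.
        if ln.strip() and not ln.startswith(("#", "track", "browser"))
    ]
    # Column 8 (index 7) is the -log10(p-value) in narrowPeak format.
    return sorted(records, key=lambda rec: float(rec.split("\t")[7]), reverse=True)


# Made-up records: chrom, start, end, name, score, strand,
# signalValue, -log10(p), -log10(q), summit offset.
example = [
    "chr1\t100\t200\tpeak_a\t50\t.\t4.2\t3.1\t1.0\t50",
    "chr1\t500\t650\tpeak_b\t900\t.\t25.0\t42.7\t38.9\t70",
    "chr2\t300\t420\tpeak_c\t120\t.\t6.8\t9.4\t6.2\t60",
]

sorted_peaks = sort_narrowpeak_by_pvalue(example)
for rec in sorted_peaks:
    print(rec)
```

With real data you would pass an open file handle instead of the list and write the sorted records to a new file before handing both replicates to IDR.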
I agree. I have seen 2 control replicates being pooled for peak calling in MACS2.
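For comparison, the two strategies discussed in this thread could be set up roughly like this. It is only a sketch that builds the `macs2 callpeak` argument lists without running anything. The BAM file names and run names are hypothetical, `-g hs` assumes a human genome, and using the pooled controls for each replicate is just one possible choice. MACS2 pools multiple files given to `-t` or `-c`.

```python
def macs2_callpeak(treatment, controls, name, pvalue=0.05):
    """Build a `macs2 callpeak` argument list (nothing is executed here)."""
    return [
        "macs2", "callpeak",
        "-t", *treatment,      # one or more ChIP BAMs; several files are pooled
        "-c", *controls,       # one or more control BAMs; several files are pooled
        "-f", "BAM",
        "-g", "hs",            # assumption: human genome size
        "-n", name,
        "-p", str(pvalue),     # relaxed p-value cutoff, as suggested above for IDR
    ]


# Hypothetical file names, for illustration only.
chip = ["chip_rep1.bam", "chip_rep2.bam"]
igg = ["igg_rep1.bam", "igg_rep2.bam"]

# Strategy 1: pool both replicates and both IgG controls in a single call.
pooled_cmd = macs2_callpeak(chip, igg, "pooled")

# Strategy 2: call each replicate separately (here against the pooled
# controls), then sort each output and compare the replicates with IDR.
per_rep_cmds = [
    macs2_callpeak([bam], igg, f"rep{i}") for i, bam in enumerate(chip, start=1)
]

for cmd in [pooled_cmd, *per_rep_cmds]:
    print(" ".join(cmd))
```

The lists can be passed to `subprocess.run` if you want to launch the jobs from Python.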