I'm working on some ChIP-seq data where I want to measure differential binding between treatment and control conditions. In the treatment condition I see lots of binding, and in the control condition I see very little to no binding. I used DiffBind with both the DESeq2 and edgeR normalization methods to make MA plots of the signal from the peaks called in both the treatment and control conditions.
DESeq2:
edgeR:
You can see that the normalization from DESeq2 appears to be more appropriate. I also looked into using the csaw package to call differential binding. However, one paragraph in the manual (page 25) worried me:
As an aside, the csaw pipeline can also be applied to search for "DB" between ChIP libraries and control libraries.
The ChIP and control libraries can be treated as separate groups, in which most "DB" events are expected to be enriched in the ChIP samples. If this is the case, the filtering procedure described above is inappropriate as it will select for windows with differences between ChIP and control samples. This compromises the assumption of the null hypothesis during testing, resulting in loss of type I error control.
Theoretically, is the inappropriate filtering/comparison between ChIP and input that the manual describes similar to the comparison I'm making above? I have no binding in the control condition and lots in the treatment condition. The regions were "filtered" not by large read count differences, but by prior peak calling. More generally, if you are looking at differential binding/expression between conditions, and you see no binding/expression in your control and lots in your treatment, are the assumptions of the null hypothesis compromised?
Is what you call your "control condition" the input of the experiment, or is it the ChIP of untreated samples? If it is the latter, then don't worry: from what I understand, the warning only applies to filtering based on DB between IP and input.
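To illustrate why the manual warns about filtering on the ChIP-vs-input difference, here is a toy simulation. It is only a sketch under simplifying assumptions: Gaussian "abundances" with known variance and a plain z-test, not the count models or functions used by csaw, DiffBind, DESeq2, or edgeR. All names (`simulate`, `keep_frac`, and so on) are hypothetical. Every window is simulated under the null (no true difference), and the windows are then filtered in two ways before testing: by average abundance across all samples, and by the absolute difference between the two groups.

```python
import random
from statistics import NormalDist, fmean


def simulate(n_windows=20000, n_per_group=3, keep_frac=0.10, alpha=0.05, seed=0):
    """Return (false-positive rate after abundance filter,
               false-positive rate after difference filter)."""
    rng = random.Random(seed)
    std_norm = NormalDist()
    # Standard error of the difference of two group means (unit variance assumed).
    se = (2.0 / n_per_group) ** 0.5

    windows = []
    for _ in range(n_windows):
        # Both groups come from the same distribution: the null is true everywhere.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diff = fmean(a) - fmean(b)
        avg = fmean(a + b)
        # Two-sided z-test p-value for the group difference.
        p = 2.0 * (1.0 - std_norm.cdf(abs(diff) / se))
        windows.append((avg, abs(diff), p))

    n_keep = int(n_windows * keep_frac)

    def false_positive_rate(key_index):
        # Keep the top windows by the chosen filter statistic, then count
        # how many of them are (falsely) called significant.
        kept = sorted(windows, key=lambda w: w[key_index], reverse=True)[:n_keep]
        return sum(w[2] < alpha for w in kept) / n_keep

    return false_positive_rate(0), false_positive_rate(1)


fpr_abundance, fpr_difference = simulate()
print(f"filter on average abundance: FPR = {fpr_abundance:.3f}")
print(f"filter on group difference:  FPR = {fpr_difference:.3f}")
```

In this toy setting the abundance filter is independent of the test statistic, so the false-positive rate among the retained windows stays near the nominal level, whereas filtering on the between-group difference selects exactly the windows that look "DB" by chance and pushes the rate far above it. How closely peak calling resembles either filter depends on how the peaks were called and merged, which this sketch does not model.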
Yes, it is the ChIP of untreated samples.