Hi everyone. I'm analysing H3K9me3 ChIP-seq data from an experiment I conducted. However, the bioinformatics was done by a labmate who understands programming languages but lacks experience in analysing ChIP-seq.
We are now really struggling to call differentially enriched peaks for my H3K9me3 ChIP between 2 conditions. The peaks were called with MACS2, using either the default settings or --broad. Other parameters are as below. However, after calling the differentially enriched peaks using bdgdiff, I found that the "differential peaks" called are not what I expected.
Question 1: Is there any way to call differential peaks based on the length of the broad peaks? That would tell us whether the histone mark is differentially enriched over a wide region (such as a gene body).
Question 2: Sometimes the ChIP signal is enriched over a broad region and has a similar intensity throughout that region in both conditions, as shown in IGV. However, bdgdiff still calls some narrow regions inside the broad one as differential in one of the conditions. I'm afraid that these differentially enriched peaks are not biologically meaningful, since the histone enrichment is actually similar across the region. Is there any way to eliminate this kind of "false positive" differentially enriched peak? Thanks!!
I am not an expert on this, but I attended a class taught by the author of MMDiff and it looked great for differential peak calling. Maybe worth a try? The paper is here, and the software is a Bioconductor package.
Hi Fabio, thanks for your suggestion. However, according to a review I've read (https://academic.oup.com/bib/article-lookup/doi/10.1093/bib/bbv110#supplementary-data), MMDiff seems to be designed for sharp peaks, e.g. TF peaks. By the way, what kind of ChIP-seq data did you analyse using this software?
I only played with the exercises they gave us in the class, which, if I remember correctly, used H3K4me3 data. It is possible that things are as you say; this is beyond my knowledge, sorry.
Do you have replicates per condition?
Hi geek, yes, I've got biological replicates, but they were pooled (probably during or just before peak calling).
If you have replicates, DiffBind is a really useful R package that makes differential peak calling pretty straightforward. I was really impressed with it. If you don't have replicates, MAnorm is also useful for directly comparing two samples. Both will derive a consensus peak set to normalize and compare signal at these peaks across the samples/conditions/treatments. Sounds like they may be worth a look in your case.
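To give a rough idea of what "deriving a consensus peak set" means, here is a minimal Python sketch that merges overlapping peaks from two conditions into shared regions. It is only an illustration of the concept, not the actual procedure used by DiffBind or MAnorm; the function name `consensus_peaks` and the toy coordinates are made up for this example.

```python
from collections import defaultdict


def consensus_peaks(*peak_sets):
    """Merge overlapping peaks from several peak sets into one consensus set.

    Each peak is a (chrom, start, end) tuple. This is a hypothetical helper
    for illustration only; it is not how DiffBind or MAnorm do it internally.
    """
    # Group all intervals by chromosome, regardless of which sample they came from.
    by_chrom = defaultdict(list)
    for peaks in peak_sets:
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        current = None
        # Sweep through intervals sorted by start and fuse any that overlap.
        for start, end in sorted(by_chrom[chrom]):
            if current is None:
                current = [start, end]
            elif start <= current[1]:
                current[1] = max(current[1], end)
            else:
                merged.append((chrom, current[0], current[1]))
                current = [start, end]
        if current is not None:
            merged.append((chrom, current[0], current[1]))
    return merged


# Toy peaks for two conditions (made-up coordinates).
cond_a = [("chr1", 100, 500), ("chr1", 900, 1200)]
cond_b = [("chr1", 400, 700), ("chr2", 50, 300)]

consensus = consensus_peaks(cond_a, cond_b)
print(consensus)
```

Read counts from each sample would then be compared over these shared regions instead of over each condition's own peak calls, which for broad marks avoids comparing fragments of the same domain against each other.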