As the title says, I am curious about how to normalize between two ChIP-seq datasets. When doing a quantitative comparison between two ChIP-seq datasets, how can we know that the differences between the two samples are due to the different conditions rather than to technical factors such as sequencing depth?
The reason I ask is that I am currently reading the paper "Epigenetic Regulation of Learning and Memory by Drosophila EHMT/G9a".
The article mentions that:
To compensate for differences in sequencing depth and mapping efficiency among the two ChIP-seq samples, the total number of unique tags of each sample was uniformly equalized relative to the sample with the lowest number of tags (7,043,913 tags), allowing for quantitative comparisons.
I just don't understand how this normalization is actually done.
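One plausible reading of "uniformly equalized relative to the sample with the lowest number of tags" is that the larger sample is randomly downsampled to the size of the smaller one, or equivalently that its counts are scaled by the ratio of the two library sizes. The paper does not say which was used, so the following is only a minimal sketch of both options. The function names and the toy tag lists are hypothetical and are not taken from the paper.

```python
import random

def downsample_tags(tags, target, seed=0):
    """Randomly keep `target` tags, sampling without replacement."""
    if target > len(tags):
        raise ValueError("target is larger than the number of tags")
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return rng.sample(tags, target)

def equalize(samples, seed=0):
    """Downsample every sample to the size of the smallest one.

    `samples` maps a sample name to a list of unique tags, for example
    (chromosome, position, strand) tuples.
    """
    target = min(len(tags) for tags in samples.values())
    return {name: downsample_tags(tags, target, seed)
            for name, tags in samples.items()}

def scaling_factors(tag_counts):
    """Alternative to downsampling: per-sample multipliers that bring
    every library to the depth of the smallest one."""
    smallest = min(tag_counts.values())
    return {name: smallest / n for name, n in tag_counts.items()}

# Toy example with made-up tags (not real data).
samples = {
    "wild_type": [("chr2L", pos, "+") for pos in range(100)],
    "mutant": [("chr2L", pos, "+") for pos in range(40)],
}
equalized = equalize(samples)
factors = scaling_factors({name: len(tags) for name, tags in samples.items()})
```

After this step both samples contain the same number of tags, so per-region tag counts can be compared directly. Note that this only corrects for depth; it does not correct for differences in signal-to-noise ratio between the two IPs.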
Hello! Hijacking this post a bit. Since it has been almost 3 years since this was asked, I was wondering whether people still normalize by the total number of reads. I'm having a hard time comparing two ChIP-seq datasets (each normalized by its input). One of the IP libraries is much bigger than the other, and I find that MACS is not showing all the peaks it should. So I guess I have two problems: first, one of the IPs is not showing all the peaks it should; second, how do I compare my two IP libraries if they do not have the same number of reads to start with?
I also posted a question on this forum, just in case, with more details about my pipeline. Thanks!
Comparing two ChIP-seq libraries
Rita
It is not a good idea to post a new question in the answer section of an old question. We're not really a forum where threads go on and on; the value of the site is in having one question followed by answers to that specific question. Moreover, posting here makes the question a lot less visible, and far fewer people will take note of it.
I have moved your answer (which was really a question) to the comment section of the main post.
MAnorm: To circumvent the issue of differences in S/N ratio between samples, we focused on ChIP-enriched regions (peaks), and introduced a novel idea, that ChIP-Seq common peaks could serve as a reference to build the rescaling model for normalization. This approach is based on the empirical assumption that if a chromatin-associated protein has a substantial number of peaks shared in two conditions, the binding at these common regions will tend to be determined by similar mechanisms, and thus should exhibit similar global binding intensities across samples. This idea is further supported by motif analysis that we present.
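As a rough illustration of the idea in the quote, the rescaling model can be fitted on the common peaks only and then applied to all peaks. The sketch below assumes the usual M-A formulation (M is the log2 ratio of the two read counts in a peak, A is their average log2 intensity) and uses an ordinary least-squares line in place of the robust regression that, as far as I know, the actual MAnorm tool uses. The function names and the toy counts are hypothetical.

```python
import math

def ma_values(count_a, count_b, pseudocount=1.0):
    """Return (M, A) for one peak: log2 ratio and mean log2 intensity."""
    a = count_a + pseudocount
    b = count_b + pseudocount
    return math.log2(a / b), 0.5 * math.log2(a * b)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Assumption: a plain OLS fit stands in here for a robust regression.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def normalize_peaks(peaks, pseudocount=1.0):
    """Return the normalized M value of every peak.

    `peaks` is a list of (count_a, count_b, is_common) tuples. The line is
    fitted on common peaks only, then subtracted from the M of all peaks.
    """
    ma = [ma_values(a, b, pseudocount) for a, b, _ in peaks]
    common = [pair for pair, peak in zip(ma, peaks) if peak[2]]
    slope, intercept = fit_line([avg for _, avg in common],
                                [m for m, _ in common])
    return [m - (slope * avg + intercept) for m, avg in ma]

# Toy example: sample A is sequenced twice as deeply as sample B, so the
# common peaks all show a 2-fold ratio that reflects depth, not biology.
peaks = [
    (20, 10, True),
    (200, 100, True),
    (2000, 1000, True),
    (400, 100, False),  # peak called in only one sample
]
normalized_m = normalize_peaks(peaks, pseudocount=0.0)
```

In this toy case the common peaks end up with a normalized M of about 0, while the last peak keeps a residual log2 ratio of about 1, which is the part of its difference not explained by the global shift. The approach relies on the assumption stated in the quote: binding at common peaks should be similar across conditions.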