Question

Normalization using spike-in chromatin

6.5 years ago

goodez ▴ 640

I have many human ChIP-seq samples that I need to compare. All of them received the same 10% spike-in of mouse chromatin. At first I was going to simply normalize each sample to its total number of spike-in reads. However, I've been asked to do something a little more sophisticated and need some advice.

I have used the Irreproducible Discovery Rate (IDR) method to find ChIP-seq peaks (called by MACS2) that are highly reproducible and high confidence. I need to use only these confident peak regions for my normalization. That is, I have counted the total mouse spike-in reads that fall into the confident peak regions for each sample. For example, for one antibody I have this scenario:

Sample             Reads within confident peaks
Wildtype rep. 1    1.02 million
Wildtype rep. 2    0.78 million
Mutant rep. 1      1.01 million
Mutant rep. 2      0.60 million

What is the best way to normalize the samples in this case? And should I still normalize to RPM before applying the spike-in scaling factor? I am thinking I should probably skip the RPM step and just use the spike-in. Hopefully this question makes sense to someone. Thanks.
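To make the spike-in-only option concrete, here is a minimal sketch of how the scaling factors could be computed from the counts listed above. The function name `spikein_scale_factors` is hypothetical, and using the sample with the fewest spike-in reads as the reference is an assumption on my part (one convention among several), not an established recommendation.

```python
# Spike-in reads within confident peaks, taken from the counts listed above.
spikein_counts = {
    "Wildtype rep. 1": 1.02e6,
    "Wildtype rep. 2": 0.78e6,
    "Mutant rep. 1": 1.01e6,
    "Mutant rep. 2": 0.60e6,
}


def spikein_scale_factors(counts):
    """Return a per-sample factor that scales every spike-in count
    down to the smallest count in the set (assumed reference)."""
    reference = min(counts.values())
    return {sample: reference / n for sample, n in counts.items()}


scale_factors = spikein_scale_factors(spikein_counts)
for sample, factor in scale_factors.items():
    print(f"{sample}: {factor:.3f}")
```

Under this sketch, each sample's raw human coverage would be multiplied by its factor, with no RPM step applied first. Whether that is appropriate is exactly what I am unsure about.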

ChIP-Seq Normalization Spike-in • 2.2k views
