Question

Normalize coverage to spike-in chromatin

6.9 years ago

goodez ▴ 640

I have a number of human-cell samples that were sequenced under different conditions. All samples received the same 10% spike-in of mouse chromatin. I need to normalize the coverage of each sample based on its number of mouse reads after sequencing.

I posted a similar but more complicated question a few days ago, but I want to ask a simplified version here that may actually get a response. As a basic example, let's say I have 4 samples:

Sample | Mouse reads
-------|-------------
1      | 1.02 million
2      | 0.78 million
3      | 1.01 million
4      | 0.60 million

1) What are your ideas to scale each sample to the same level of mouse reads?
2) Does it make sense to first scale to RPM of human reads before scaling to mouse, or only scale based on mouse reads?

ChIP-Seq Normalization • 2.2k views

score 1 · Answer 1 · 2018-06-26

6.9 years ago

Devon Ryan 105k

1) The most robust method would be to run something like multiBamSummary bins on the mouse BAM files to get a table of counts. You can then load that into R and normalize it with DESeq2, which gives you a scale factor per sample.
2) It depends a bit on what you want to do with the results, but ideally you should scale based only on the mouse-derived scale factors from above.
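
To make the idea concrete, here is a minimal pure-Python sketch of the median-of-ratios calculation that DESeq2's size-factor estimation is based on. It is not DESeq2 itself, and the function name `size_factors` and the toy counts are made up for illustration. It assumes you already have a table of mouse bin counts with one row per bin and one column per sample.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (hypothetical helper, not a DESeq2 API).

    counts: list of rows, one row per bin, one count per sample in each row.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        # Bins with a zero in any sample are skipped: their geometric mean is zero.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no bin has non-zero counts in every sample")
    # The size factor of a sample is the median ratio to the per-bin geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


# Made-up mouse bin counts: rows are bins, columns are three samples.
toy_counts = [
    [100, 50, 200],
    [200, 100, 400],
    [300, 150, 600],
    [10, 0, 20],  # skipped because sample 2 has a zero here
]

factors = size_factors(toy_counts)
# Multiplicative scale factors for the human coverage: the reciprocal of each size factor.
scale = [1.0 / f for f in factors]
print(factors)
print(scale)
```

Dividing a sample's coverage by its size factor (or multiplying by the reciprocal) puts all samples on the same mouse-derived scale. A tool that expects a multiplicative factor would take the reciprocal.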

Thanks, seems like a simple solution. Will try when I get a chance.

6.9 years ago by goodez ▴ 640

I need to use the BED-file mode so I can get coverage over specific genomic regions. Would I need to use multiBamSummary's --outRawCounts option in order to load the counts into R? Or do you think I can use the .npz output?

6.9 years ago by goodez ▴ 640

Yes, definitely use the --outRawCounts option. You can load the .npz file, but you need special packages installed and it's generally more hassle than it's worth.
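
As a rough illustration of why the raw-counts file is easy to work with, here is a small Python sketch that parses such a table using only the standard library. It assumes the file is tab-delimited, with a header line that starts with `#` and has single-quoted column names (chr, start, end, then one column per BAM file). Check your own output before relying on this. The function name `read_raw_counts` and the example rows are made up.

```python
import csv
import io


def read_raw_counts(handle):
    """Parse a tab-delimited counts table (hypothetical helper).

    Assumed layout: header like  #'chr' 'start' 'end' 'a.bam' 'b.bam',
    then one row per region with one count per sample.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Drop the leading '#' and the single quotes around each column name.
    names = [h.lstrip("#").strip("'") for h in header]
    samples = names[3:]
    regions, counts = [], []
    for row in reader:
        if not row:
            continue
        regions.append((row[0], int(row[1]), int(row[2])))
        counts.append([float(x) for x in row[3:]])
    return samples, regions, counts


# Made-up example content standing in for a real raw-counts file.
example = (
    "#'chr'\t'start'\t'end'\t'sample1.bam'\t'sample2.bam'\n"
    "chr1\t0\t10000\t12.0\t7.0\n"
    "chr1\t10000\t20000\t30.0\t15.0\n"
)

samples, regions, counts = read_raw_counts(io.StringIO(example))
print(samples)
print(regions)
print(counts)
```

With a real file you would pass an open file handle instead of the `io.StringIO` object.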

6.9 years ago by Devon Ryan 105k