Question

Chip-Seq Cancer Cell Lines

5.3 years ago

zeeshan.fazal1986 ▴ 10

Hi, we have generated H3K27me3 ChIP-seq data on cancer cell lines resistant to cisplatin. For the parent (non-resistant) cell lines we have 3 ChIP replicates and 3 input replicates. We have the same for the resistant cell lines: 3 ChIP replicates and 3 input replicates.

Q1: After the alignment step, should I call peaks separately for each input and experiment replicate, and then find the common peaks between all the replicates of input and experiment?
Q2: Ultimately I want to find differential peaks between parent and resistant cell lines. How can I achieve this? Should I use HOMER?

Thanks

ChIP-Seq • 1.4k views

updated 5.3 years ago by ATpoint 85k • written 5.3 years ago by zeeshan.fazal1986 ▴ 10

score 1 · Answer 1 · 2019-08-07

Q1: After the alignment step, should I call peaks separately for each input and experiment replicate, and then find the common peaks between all the replicates of input and experiment?

There isn't a standard approach that everyone takes. ENCODE has an IDR (Irreproducible Discovery Rate) pipeline to get a combined set of peak calls from replicates, but it's mainly for TF ChIP-seq. I've used it for narrow marks like H3K4me3 and it works OK, but I don't think you'll get much from it with a broad mark like H3K27me3. I've seen people just keep peaks with >50% overlap between replicates. I've also seen people combine the alignments from all replicates into a single BAM and do peak calling on that. I think the overlap approach is more stringent than pooling, so you'll get fewer false positives but probably miss some true positives.
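To make the ">50% overlap between replicates" idea concrete, here is a minimal sketch in plain Python. The function names, the toy coordinates, and the choice of replicate 1 as the reference set are all assumptions for illustration; in practice you would more likely do this with a tool such as bedtools. Note that it only checks overlap against single peaks, so a broad domain that is fragmented into several small peaks in another replicate could be missed.

```python
def overlap_fraction(a, b):
    """Fraction of peak a covered by peak b; peaks are (chrom, start, end), half-open."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    return max(shared, 0) / (a[2] - a[1])


def reproducible_peaks(reference, others, min_frac=0.5):
    """Keep reference peaks that are covered by more than min_frac of their
    length by at least one peak in every other replicate."""
    kept = []
    for peak in reference:
        if all(any(overlap_fraction(peak, q) > min_frac for q in rep) for rep in others):
            kept.append(peak)
    return kept


# Toy peak calls for three replicates (hypothetical coordinates).
rep1 = [("chr1", 100, 1100), ("chr1", 5000, 6000), ("chr2", 200, 900)]
rep2 = [("chr1", 300, 1300), ("chr2", 100, 1000)]
rep3 = [("chr1", 0, 900), ("chr1", 5900, 7000), ("chr2", 250, 800)]

consensus = reproducible_peaks(rep1, [rep2, rep3])
print(consensus)
```

The second peak of replicate 1 is dropped because replicate 2 has nothing there and replicate 3 covers only a small part of it.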

Q2: Ultimately I want to find differential peaks between parent and resistant cell lines. How can I achieve this? Should I use HOMER?

You can just look at the peak calls and compare peaks that are called in one group and not called in the other. If you get interesting results from that, then great. Otherwise, there are various tools out there to look at differential levels of enrichment. MACS2 has a differential peak operation (bdgdiff), and there's a fairly new tool called normR that does differential enrichment. I've also seen papers that call peaks, get the read counts within each peak, and then do a standard differential-expression-style analysis using, for example, edgeR or DESeq2.
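The simple "called in one group but not the other" comparison could look like the sketch below. It assumes you already have one consensus peak list per condition as (chrom, start, end) tuples; the function names and coordinates are made up for illustration. Any overlap at all counts as "shared" here, which is a deliberately crude criterion.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def group_specific(peaks, other_peaks):
    """Peaks that have no overlapping peak in the other condition."""
    return [p for p in peaks if not any(overlaps(p, q) for q in other_peaks)]


# Hypothetical consensus peaks per condition.
parent = [("chr1", 100, 2000), ("chr3", 400, 3000)]
resistant = [("chr1", 500, 2500), ("chr2", 1000, 4000)]

lost_in_resistant = group_specific(parent, resistant)
gained_in_resistant = group_specific(resistant, parent)
print(lost_in_resistant, gained_in_resistant)
```

This only captures presence/absence. Quantitative changes within shared peaks need one of the count-based tools mentioned above.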

score 1 · Answer 2 · 2019-08-07

I recommend using a dedicated peak caller that makes use of replicate information and inputs, such as PePr. MACS is fine but does not use replicate information. IDR after MACS is possible for sharp peaks (so not H3K27me3), but by default it only accepts n=2 rather than n=3, so you would need a custom approach, which is unnecessary if you use PePr from the start. For differential analysis, PePr has a command for this as well, which might be interesting if you have little or no experience with differential analysis software such as csaw/edgeR, DiffBind or DESeq2. The latter tools have outstanding documentation; go read it to get some background. I would prefer running PePr on both conditions of your experiment, then merging the peaks and creating a count matrix, which you can analyze, e.g. with csaw.
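As a rough illustration of the "merge the peaks, then build a count matrix" step, here is a sketch in plain Python. It assumes peaks are (chrom, start, end) tuples and that each sample's reads have already been reduced to (chrom, position) pairs; the sample names and numbers are invented. Real data would be counted from BAM files with a dedicated tool, and the resulting matrix would then go into csaw/edgeR or DESeq2.

```python
def merge_peaks(peaks):
    """Merge overlapping or touching (chrom, start, end) intervals into a union set."""
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous region instead of starting a new one.
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def count_matrix(regions, samples):
    """Count read positions per region for each sample (dict: name -> list of (chrom, pos))."""
    return {
        name: [
            sum(1 for c, pos in reads if c == chrom and start <= pos < end)
            for chrom, start, end in regions
        ]
        for name, reads in samples.items()
    }


# Hypothetical peak calls from each condition.
parent_peaks = [("chr1", 100, 500), ("chr2", 50, 300)]
resistant_peaks = [("chr1", 400, 900), ("chr1", 2000, 2600)]
regions = merge_peaks(parent_peaks + resistant_peaks)

# Hypothetical read positions per sample.
samples = {
    "parent_rep1": [("chr1", 150), ("chr1", 450), ("chr2", 100)],
    "resistant_rep1": [("chr1", 800), ("chr1", 2100), ("chr1", 2500), ("chr1", 3000)],
}
counts = count_matrix(regions, samples)
print(regions)
print(counts)
```

Counting over the merged set rather than each condition's own peaks keeps the regions identical across samples, which is what the downstream count-based tools expect.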

Alternatively, csaw has a strategy (read the manual) that uses sliding windows to avoid the need for peak calling altogether. I still prefer calling peaks separately, as one often needs a list of peaks for other purposes such as clustering.
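To give a feel for what window-based counting means, here is a toy sketch in Python. It is not csaw's implementation, only the general idea of counting reads in overlapping windows along one chromosome; the function name, window width, spacing, and positions are assumptions for illustration.

```python
from collections import Counter


def window_counts(positions, width, spacing):
    """Count read positions in sliding windows [k*spacing, k*spacing + width).

    Returns a dict mapping window start -> count, for non-empty windows only.
    """
    counts = Counter()
    for pos in positions:
        # Indices of the first and last window that contain this position.
        first = max(0, (pos - width) // spacing + 1)
        last = pos // spacing
        for k in range(first, last + 1):
            counts[k * spacing] += 1
    return dict(sorted(counts.items()))


# Hypothetical read positions on a single chromosome.
reads = [120, 180, 260, 1000]
per_window = window_counts(reads, width=200, spacing=100)
print(per_window)
```

Each sample would get such a vector of window counts, and the windows are then tested for differences between conditions in the same way as peak counts.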