Question

Publicly Available ChIP-seq Replicates Do Not Overlap

3.0 years ago

buffealo ▴ 130

Hello,

I want to ask a very general question. I am conducting ChIP-seq analysis. I know that every dataset is unique and has its own properties, and that the parameters of a ChIP-seq analysis need to be adjusted to the dataset. I have spent a long time understanding and optimizing my method. Now, when I run the analysis on a dataset published in Nature, the biological replicates almost do not overlap at all. This is not a slight difference; it looks as if I am applying the wrong analysis pipeline to it. I am not sure what I am missing. Has anyone faced the same problem who could advise me? Thank you.

publicdata chipseq overlappinpeaks chipseqoverlap • 957 views

Can you clarify:

- Are you saying that you took the raw data for multiple samples from the paper, did your own analysis, and (say) replicates one and two from this analysis don't overlap?
- Or that you took raw data from the paper, did your own analysis, and the results of your analysis don't overlap at all with the results of the analysis done in the paper?
- Or that you took processed data for multiple replicates from the paper, and (say) replicate 1 from their analysis doesn't overlap at all with replicate 2 from their analysis?

3.0 years ago by i.sudbery 20k

Hello, I am doing the second one. I took the dataset from the article and ran my own analysis, and my results do not overlap with the results in the article. Normally we expect biological replicates of a condition in ChIP-seq to overlap to a much larger extent than in my Venn diagram shown here. However, my results look like this:

[Image: Venn diagram of the overlap between replicates]

It looks as if I had taken samples treated under distinct conditions and expected them to overlap. However, these are biological replicates.

I hope I have explained my situation clearly. Thank you.

3.0 years ago by buffealo ▴ 130

score 0 · Answer 1 · 2021-11-09

There are some excellent tools in deepTools (available on the command line or within Galaxy) for assessing the correlation and divergence between datasets (e.g. bigWig, bedGraph, BAM).

Try those on, for example, the raw mapped BAMs for all replicates, and maybe report back for more opinions.
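To make the idea of "correlation between replicates" concrete, here is a minimal pure-Python sketch of the general approach: count reads in fixed-size genomic bins for each replicate, then compute a Spearman correlation between the two count vectors. This is not deepTools code. The function names, the toy read positions and the bin size are all assumptions chosen for illustration. On real BAM files you would let a dedicated tool do the counting.

```python
from statistics import mean


def bin_counts(read_starts, chrom_length, bin_size):
    """Count reads per fixed-size bin from 0-based read start positions."""
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in read_starts:
        if 0 <= pos < chrom_length:
            counts[pos // bin_size] += 1
    return counts


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: read start positions on one 100 bp "chromosome".
rep1_reads = [1, 3, 4, 12, 55, 56, 57, 58, 90]
rep2_reads = [2, 5, 13, 14, 51, 52, 59, 95, 96]

rep1 = bin_counts(rep1_reads, chrom_length=100, bin_size=10)
rep2 = bin_counts(rep2_reads, chrom_length=100, bin_size=10)
print("Spearman correlation between replicates:", spearman(rep1, rep2))
```

A low correlation at the level of binned read counts would suggest that the replicates already disagree before peak calling, whereas a high correlation with poor peak overlap would point more towards the peak-calling step or its thresholds. In deepTools, `multiBamSummary` and `plotCorrelation` are the tools for this kind of comparison on real BAM files.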

In general, though, the raw data behind many published papers are very flaky in my experience.