Publicly Available ChIP-seq Samples Do Not Overlap
3.0 years ago
buffealo ▴ 130

Hello,

I want to ask a very general question. I am conducting ChIP-seq analysis. I know that every dataset is unique and has its own properties, and that ChIP-seq parameters need to be adjusted to each dataset. I have spent a long time understanding and optimizing the method. However, when I run the analysis on a dataset published in Nature, the biological replicates barely overlap at all. This is not a slight difference; it looks as if I am applying the wrong analysis pipeline to it. I am not sure what I am missing. Has anyone faced the same problem, and could you advise me? Thank you.

publicdata chipseq overlappinpeaks chipseqoverlap • 957 views

Can you clarify:

  • Are you saying that you took the raw data for multiple samples from the paper, did your own analysis, and (say) replicates one and two from this analysis don't overlap?
  • Or that you took raw data from a paper, did your own analysis, and the results from your analysis don't overlap at all with the results from the analysis done in the paper?
  • Or that you took processed data for multiple replicates from the paper, and (say) replicate 1 from their analysis doesn't overlap at all with replicate 2 from their analysis?

Hello, I am doing the second one. I took the dataset from the article, and my results do not overlap with the results in the article. Normally we expect biological replicates of a condition in ChIP-seq to overlap to a much larger extent than in my Venn diagram shown here. However, my results look like this:

[Image: Venn diagram of peak overlap between the biological replicates]

It looks as if I took samples treated under distinct conditions and expected them to overlap. However, these are biological replicates.

I hope I have explained my situation clearly. Thank you.
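
To put a number on the overlap described above rather than relying only on the Venn diagram, something like the following could be used. It is a minimal sketch, assuming the peaks are in BED-like format (chromosome, start, end, half-open coordinates) and that "overlap" means sharing at least 1 bp; the function names and the toy peaks are hypothetical and not taken from the dataset in question.

```python
from collections import defaultdict


def read_peaks(lines):
    """Parse BED-like lines into {chrom: sorted list of (start, end)}."""
    peaks = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        chrom, start, end = line.split()[:3]
        peaks[chrom].append((int(start), int(end)))
    return {chrom: sorted(ivals) for chrom, ivals in peaks.items()}


def count_overlapping(a, b):
    """Count peaks in `a` that overlap at least one peak in `b` by >= 1 bp."""
    hits = 0
    for chrom, a_peaks in a.items():
        b_peaks = b.get(chrom, [])
        j = 0
        for start, end in a_peaks:
            # Peaks in b that end before this peak starts cannot overlap it,
            # nor any later peak in a (a is sorted by start).
            while j < len(b_peaks) and b_peaks[j][1] <= start:
                j += 1
            if j < len(b_peaks) and b_peaks[j][0] < end:
                hits += 1
    return hits


def overlap_fraction(a, b):
    """Fraction of peaks in `a` that are also found in `b`."""
    total = sum(len(v) for v in a.values())
    return count_overlapping(a, b) / total if total else 0.0


# Toy peaks for illustration only (hypothetical, not real data).
rep1 = read_peaks(["chr1\t100\t200", "chr1\t300\t400", "chr2\t50\t150"])
rep2 = read_peaks(["chr1\t150\t250", "chr2\t500\t600"])

print(f"rep1 peaks found in rep2: {overlap_fraction(rep1, rep2):.2f}")
print(f"rep2 peaks found in rep1: {overlap_fraction(rep2, rep1):.2f}")
```

Reporting the fraction in both directions helps when one replicate has many more peaks than the other, since a Venn diagram can then look worse than the agreement really is.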

3.0 years ago

There are some excellent tools in deepTools (available on the command line or within Galaxy) for finding the correlation and divergence between datasets (e.g. bigWig, bedGraph, BAM).

Try those on, for example, the raw mapped BAMs for all replicates, and maybe report back for more opinions?
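
As a rough illustration of what such a replicate correlation measures (deepTools provides this via `multiBamSummary` followed by `plotCorrelation`), here is a minimal pure-Python sketch of a Spearman correlation between per-bin read counts from two replicates. The bin counts below are made-up toy numbers, and the helper names are hypothetical; this is not how deepTools is implemented internally.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy read counts per genomic bin (hypothetical, for illustration only).
rep1_counts = [5, 40, 3, 120, 8, 60]
rep2_counts = [7, 35, 2, 150, 10, 55]

print(f"Spearman correlation: {spearman(rep1_counts, rep2_counts):.3f}")
```

If the mapped reads of the replicates correlate well genome-wide but the called peaks do not overlap, the peak-calling step is the more likely culprit; if the reads themselves correlate poorly, the problem is probably upstream of the pipeline.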

In general, though, the raw data behind many published papers are very flaky in my experience.
