Too many ChIP-seq peaks after DiffBind: how to set a criterion to select more "significant" genes?
4.6 years ago
yepeh72919 ▴ 10

Hello all!

I have run a ChIP-seq analysis using DiffBind to compare two conditions, and the number of peaks obtained is very large (~49,000 peaks). Some genes appear more than once in the list because several peaks fall in different regions of the same gene. I would like to do some downstream analysis (e.g. gene ontology), but the number of peaks is far too large.

I have the following questions:

  1. Should I apply a cutoff of this kind at all? The DiffBind scores range from 1.5 to 6.
  2. Is there a way to set a cutoff that limits the number of peaks used for downstream analysis?

One option I can think of: set an arbitrary cutoff on the DiffBind score, e.g. select only peaks with scores > 3.5 for the analysis.

Another option I can think of: use the ratio of peak height in condition 1 vs. condition 2, and select genes whose peak height is more than 1.5-fold higher in condition 1.

If peak height is a good way to obtain more significant genes, what tools do you recommend?

Thanks!

ChIP-Seq diffbind • 1.4k views

If the number of significant regions is unexpectedly high, use MA-plots to check whether normalization is off and has produced many false positives. Actually, one should always do that. Proper normalization should center the majority of regions at roughly y = 0.
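As a rough numerical companion to looking at the MA-plot, here is a minimal sketch that computes M (difference) and A (average) per region and summarizes where the cloud is centered. It assumes you have exported per-region log2 normalized read concentrations for the two conditions as two plain lists; the function name and the toy numbers are made up for illustration and are not part of DiffBind.

```python
import statistics

def ma_summary(conc_a, conc_b):
    """Summarize the centering of an MA-plot.

    conc_a, conc_b: per-region log2 normalized concentrations for the two
    conditions (assumed inputs, same length, same region order).
    """
    if not conc_a or len(conc_a) != len(conc_b):
        raise ValueError("inputs must be non-empty and of equal length")
    m_vals = [a - b for a, b in zip(conc_a, conc_b)]        # y axis: log2 difference
    a_vals = [(a + b) / 2 for a, b in zip(conc_a, conc_b)]  # x axis: mean log2 signal
    return {
        "M": m_vals,
        "A": a_vals,
        "median_M": statistics.median(m_vals),
        "frac_up": sum(m > 0 for m in m_vals) / len(m_vals),
    }

# Toy example: regions scattered around M = 0, as expected after sensible normalization.
cond1 = [5.0, 6.2, 7.1, 4.8, 8.0]
cond2 = [5.1, 6.0, 7.0, 5.0, 7.9]
centered = ma_summary(cond1, cond2)

# Same data with condition 2 shifted down by one log2 unit: every region now looks "up".
shifted = ma_summary(cond1, [c - 1.0 for c in cond2])
```

A median M far from 0, or nearly all regions on one side, is a hint to revisit normalization before trusting the peak count. There is no universal threshold for "far", so treat this as a prompt to inspect the plot itself rather than a test.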

4.6 years ago
Rory Stark ★ 2.1k

Possible thresholds:

  • Lower the FDR
  • Also threshold on fold change (e.g. using the fold parameter of dba.report())
  • Just take the top n sites (sorted by FDR or Fold)
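The three thresholds above can be sketched on an exported report table. This is only an illustration in plain Python, assuming the report has been read into a list of dicts with numeric "FDR" and "Fold" keys and that Fold is on a log2 scale, so that a 1.5-fold ratio corresponds to log2(1.5), about 0.585. The function name, the "id" key and the toy values are hypothetical. In R, the th and fold arguments of dba.report() cover the first two thresholds directly.

```python
import math

def select_sites(sites, max_fdr=0.05, min_abs_fold=0.0, top_n=None, sort_by="FDR"):
    """Filter sites by FDR and absolute fold change, then optionally keep the top n.

    sites: list of dicts with numeric "FDR" and "Fold" keys (assumed column names).
    min_abs_fold is in the same units as the "Fold" column (assumed log2 here).
    """
    kept = [s for s in sites
            if s["FDR"] <= max_fdr and abs(s["Fold"]) >= min_abs_fold]
    if sort_by == "FDR":
        kept.sort(key=lambda s: s["FDR"])                      # most significant first
    elif sort_by == "Fold":
        kept.sort(key=lambda s: abs(s["Fold"]), reverse=True)  # largest change first
    else:
        raise ValueError("sort_by must be 'FDR' or 'Fold'")
    return kept if top_n is None else kept[:top_n]

# Toy table standing in for an exported report (values are made up).
sites = [
    {"id": "peak1", "Fold": 2.4, "FDR": 0.001},
    {"id": "peak2", "Fold": -0.3, "FDR": 0.0005},
    {"id": "peak3", "Fold": -1.8, "FDR": 0.02},
    {"id": "peak4", "Fold": 0.9, "FDR": 0.2},
]

# Lower the FDR and require at least a 1.5-fold change in either direction.
strict = select_sites(sites, max_fdr=0.01, min_abs_fold=math.log2(1.5))
strict_ids = [s["id"] for s in strict]   # ['peak1']

# Or simply take the top 2 sites by absolute fold change among those passing FDR.
top2 = select_sites(sites, max_fdr=0.05, top_n=2, sort_by="Fold")
top2_ids = [s["id"] for s in top2]       # ['peak1', 'peak3']
```

Whatever cutoff you pick is a pragmatic choice rather than a statistically privileged one, so it helps to check that downstream results (e.g. enriched GO terms) are reasonably stable when the threshold is varied.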