Replicates in ChIP-seq
11.3 years ago
Nick ▴ 290

I have the following dataset:

wild type:

  • 2 male biological ChIP replicates for TF A
  • 2 female biological ChIP replicates for TF B
  • 1 male ChIP sample for TF C
  • 1 male ChIP sample for TF D
  • 1 male input sample (from one of the animals used for one of the samples for TF A)
  • 1 female input sample (from one of the animals used for one of the samples for TF A)

knockouts:

  • 1 male ChIP sample for TF A
  • 1 pooled (male+female) ChIP sample for TF B

All animals are of similar age.

The main interest is the contrast between knockouts and wild types for TF A and TF B. I have the following questions:

(1) Does it make sense to take into account the samples for TF C and TF D?

(2) Does it make sense to take sex into account (no sex-specific effect is expected)?

(3) How can I make the best possible use of this data, and which tool would you recommend? I have used MACS. I am also aware of DiffBind, MEDIPS, diffReps and DBChIP but haven't used any of them, so any specific recommendation regarding a tool and a workflow (if more than one tool is to be used) is most welcome.

chip-seq replicates model • 5.0k views
11.3 years ago
Ying W ★ 4.3k
  1. No
  2. This is easier to do in DiffBind than in DBChIP, but DBChIP has a better way of estimating dispersion when there are no replicates (unless you use the same dispersion for KD as for WT).
  3. In my experience, macs2 does not give too many results. If you know R well, I would count reads (using bedtools) and experiment with some of the edgeR functions, for example normalizing by the median or by the full library size and making MA plots. If this is all the same cell type, then subtracting input might not be as important. If you want something simple to run, go with DiffBind or DBChIP; the former is better documented. Another program you might want to consider is MAnorm.
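
To make the normalization and MA-plot idea concrete, here is a minimal sketch in plain Python rather than edgeR. The counts and library sizes are made up, the function names are hypothetical, and it only illustrates the arithmetic (scaling by full library size or by the median, then computing M and A values). It is not a substitute for edgeR's own normalization or dispersion estimation.

```python
import math
from statistics import median

# Hypothetical read counts per peak region (for example from `bedtools multicov`).
# These numbers are invented for illustration only.
counts = {
    "peak1": {"WT": 120, "KO": 30},
    "peak2": {"WT": 80, "KO": 85},
    "peak3": {"WT": 15, "KO": 60},
    "peak4": {"WT": 200, "KO": 190},
}
# Hypothetical total mapped reads per library.
library_sizes = {"WT": 20_000_000, "KO": 25_000_000}


def scale_by_library_size(counts, library_sizes):
    """Convert counts to counts per million mapped reads (full library size)."""
    return {
        peak: {s: c * 1e6 / library_sizes[s] for s, c in by_sample.items()}
        for peak, by_sample in counts.items()
    }


def scale_by_median(counts):
    """Divide each sample's counts by that sample's median peak count."""
    samples = list(next(iter(counts.values())).keys())
    med = {s: median(by_sample[s] for by_sample in counts.values()) for s in samples}
    return {
        peak: {s: c / med[s] for s, c in by_sample.items()}
        for peak, by_sample in counts.items()
    }


def ma_values(normalized, a="KO", b="WT", pseudocount=0.01):
    """Return {peak: (A, M)} with A = mean log2 signal and M = log2(a / b).

    The pseudocount only guards against taking the log of zero.
    """
    out = {}
    for peak, v in normalized.items():
        log_a = math.log2(v[a] + pseudocount)
        log_b = math.log2(v[b] + pseudocount)
        out[peak] = ((log_a + log_b) / 2, log_a - log_b)
    return out


cpm = scale_by_library_size(counts, library_sizes)
ma_cpm = ma_values(cpm)
ma_median = ma_values(scale_by_median(counts))
```

Plotting M against A for both normalizations and checking whether the bulk of the points is centered on M = 0 is one way to judge which scaling suits the data better.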
11.3 years ago

My two cents:

  1. I don't see why you would take TF C and TF D into account for the TF A vs TF B contrast.

  2. You could try to take it into account and compare to a model where you don't. This should be fairly easy in a DESeq/edgeR-like method such as DiffBind or DBChIP.

  3. I would try DiffBind, make peak sets out of the various TF samples vs. input, encode as much information (which TF, knockout or not, sex, ...) as possible into a metadata table and try the GLM functionality in DiffBind.
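
As a rough illustration of such a metadata table, the sketch below writes a DiffBind-style sample sheet as CSV using Python's standard library. The column names (SampleID, Tissue, Factor, Condition, Replicate, bamReads, Peaks, PeakCaller) follow the DiffBind sample sheet format; the sample IDs, file paths and the choice to store sex in the Tissue column are assumptions made for this example, not something taken from the thread.

```python
import csv
import io

COLUMNS = ["SampleID", "Tissue", "Factor", "Condition", "Replicate",
           "bamReads", "Peaks", "PeakCaller"]

# (sample id, sex, TF, genotype, replicate) following the design in the question.
# Sample IDs are hypothetical.
SAMPLES = [
    ("A_WT_1", "male", "TFA", "WT", 1),
    ("A_WT_2", "male", "TFA", "WT", 2),
    ("A_KO_1", "male", "TFA", "KO", 1),
    ("B_WT_1", "female", "TFB", "WT", 1),
    ("B_WT_2", "female", "TFB", "WT", 2),
    ("B_KO_1", "pooled", "TFB", "KO", 1),
]


def build_sample_sheet(samples):
    """Return the sample sheet as CSV text, one row per ChIP sample."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for sample_id, sex, factor, condition, replicate in samples:
        writer.writerow({
            "SampleID": sample_id,
            "Tissue": sex,            # assumption: sex stored in the Tissue column
            "Factor": factor,
            "Condition": condition,
            "Replicate": replicate,
            "bamReads": f"bam/{sample_id}.bam",            # hypothetical path
            "Peaks": f"peaks/{sample_id}_peaks.xls",       # hypothetical path
            "PeakCaller": "macs",
        })
    return buf.getvalue()


sheet = build_sample_sheet(SAMPLES)
print(sheet)
```

Keeping TF, genotype and sex in separate columns means any of them can later be used as a factor in a contrast or as a blocking term.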

11.3 years ago
Nick ▴ 290

Is there a way to run DiffBind or any of the other tools without an input file for each sample?

I tried DiffBind, but it seems to expect an input file for each sample. I have two input files, both from wild type animals. I merged them and passed the merged input to MACS when calling peaks in the wild type samples. For the knockouts I ran MACS without any input. I then ran DiffBind with a sample sheet that listed the merged file as the input for every wild type sample and left the input cells for the knockouts blank, and DiffBind complained about the missing input files. So I reused the merged wild type input for the knockouts as well. I don't feel good about that. Is there a way to do a differential binding analysis that takes replicates into account (I have them for the wild type) but also tolerates samples with no replicates or even no input files?

I know this is not a good arrangement but this is not my data - I am just trying to tease out as much signal from it without doing anything improper.


Actually, DiffBind doesn't require controls/inputs at all. You can just leave them blank or not include the bamControl column in the sample sheet...
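
A minimal sketch of the "do not include the column" option, assuming the sample sheet is a CSV file: the hypothetical helper below removes the ControlID and bamControl columns for every sample, so that no sample refers to an input. The example rows and paths are invented.

```python
import csv
import io

# Sample sheet columns that describe the control/input.
CONTROL_COLUMNS = {"ControlID", "bamControl"}


def drop_control_columns(sheet_text):
    """Return the CSV sample sheet with the control columns removed."""
    reader = csv.DictReader(io.StringIO(sheet_text))
    kept = [name for name in reader.fieldnames if name not in CONTROL_COLUMNS]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the dropped columns in each row.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sheet: one sample with an input, one without.
example = (
    "SampleID,Factor,Condition,Replicate,bamReads,ControlID,bamControl\n"
    "A_WT_1,TFA,WT,1,bam/A_WT_1.bam,input_m,bam/input_male.bam\n"
    "A_KO_1,TFA,KO,1,bam/A_KO_1.bam,,\n"
)
stripped = drop_control_columns(example)
print(stripped)
```

Dropping the input for all samples at least treats wild type and knockout the same way, which avoids reusing a wild type input for the knockouts.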
