Question

DiffBind to call differential binding of Super-enhancers from ROSE

0

5.1 years ago

Researcher ▴ 130

Hi all, I have BED files listing the super-enhancers that ROSE identified in each sample, and I want to test them for differential binding. I have a few questions: Can I use DiffBind for this? Can DiffBind read the BED files and use the corresponding BAM files to calculate differential binding?

Has anybody done this before? Any suggestions would be highly appreciated.

Thanks

Super-enhancers ROSE DiffBind chipseq • 3.7k views

ADD COMMENT • link updated 5.1 years ago by sim.j.baum ▴ 160 • written 5.1 years ago by Researcher ▴ 130

score 3 · Answer 1 · 2019-10-29

3

5.1 years ago

GouthamAtla 12k

Super-enhancers are nothing but collections of closely spaced individual "peaks" (less than 12.5 kb apart, I guess) in linear genomic space. So if you want to do a differential analysis, you could run differential binding on all individual peaks and then test whether differential peaks are enriched within super-enhancers (Fisher's exact test on whether a differential peak lies within a super-enhancer or not). Other than that, I cannot think of a way of doing it. You could instead consider the whole linear genomic span of each super-enhancer, but that will be very noisy and include a lot of background reads.
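The enrichment test mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `fisher_exact_greater` and the example counts are hypothetical, and the test is the one-sided version (are differential peaks over-represented inside SEs?).

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: differential peaks inside super-enhancers
    b: differential peaks outside super-enhancers
    c: non-differential peaks inside super-enhancers
    d: non-differential peaks outside super-enhancers

    Returns P(X >= a) under the hypergeometric null, where X is the number
    of differential peaks that fall inside super-enhancers.
    """
    n_diff = a + b            # all differential peaks
    n_in_se = a + c           # all peaks inside super-enhancers
    total = a + b + c + d     # all peaks
    upper = min(n_diff, n_in_se)
    tail = sum(
        comb(n_diff, k) * comb(total - n_diff, n_in_se - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, n_in_se)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p_value = fisher_exact_greater(a=120, b=880, c=400, d=18600)
    print(f"one-sided p-value: {p_value:.3g}")
```

The four counts come from cross-tabulating each peak by "differential or not" and "overlaps an SE or not".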

ADD COMMENT • link 5.1 years ago by GouthamAtla 12k

2

Speaking from personal experience, this is the best way to go about it. Trying to use the SE boundaries directly will not yield the results you want.

ADD REPLY • link 5.1 years ago by jared.andrews07 ★ 18k

5

I agree. In fact, I would call peaks as normal, then take the peak summits and resize them to the average peak width, which is typically < 1 kb for H3K27ac. If windows overlap, merge them, then get a count matrix for the resulting genomic regions. Perform differential binding as usual and then filter the results for the SEs. Do not focus only on SEs, as they make up only a small part of all peaks. Although these regions have been shown to be enriched near genes that are "important" for cellular identity, more recent ATAC-seq data have also shown that the actual open-chromatin parts of these SE stretches are distinct and short (if memory serves, < 1 kb), so it is questionable what these long stretches actually are. I would always try to limit peak sizes as much as possible to avoid large, inflated counts.
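The "resize summits, then merge overlapping windows" step could look roughly like this in plain Python. It is a sketch, not what any particular tool does internally: `resize_and_merge` and the default `width=500` are assumptions made for illustration, and coordinates are treated as 0-based, half-open intervals as in BED.

```python
from collections import defaultdict


def resize_and_merge(summits, width=500):
    """Centre a fixed-width window on each summit and merge overlaps.

    summits: iterable of (chrom, summit_position) with 0-based positions.
    Returns a sorted list of (chrom, start, end) half-open intervals.
    """
    half = width // 2
    by_chrom = defaultdict(list)
    for chrom, pos in summits:
        start = max(0, pos - half)       # clip at the chromosome start
        end = pos + (width - half)
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is not None and start < cur_end:
                # Window shares at least one base with the current region.
                cur_end = max(cur_end, end)
            else:
                if cur_start is not None:
                    merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


if __name__ == "__main__":
    # Hypothetical summit positions.
    example = [("chr1", 1000), ("chr1", 1300), ("chr1", 5000), ("chr2", 100)]
    for region in resize_and_merge(example, width=500):
        print(*region, sep="\t")
```

The merged regions can then be written out as a BED file and used as the common region set for counting reads in every sample.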

ADD REPLY • link 5.1 years ago by ATpoint 85k

0

Hi ATpoint, thank you so much for such a nice description. Could you clarify which summit option you mean? Is it from MACS2 or DiffBind? Both tools have an option named "summit", and I am not sure which one would help here.

Thanks again!

ADD REPLY • link 5.1 years ago by Researcher ▴ 130

0

I am referring to MACS2 and either its narrowPeak output (column 10, the summit offset from the peak start) or its summit BED files. There is indeed a resizing option in DiffBind which you could use. Check its manual; I do not know the command by heart.
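As a small illustration of the narrowPeak route: column 10 of the narrowPeak format is the summit offset, counted 0-based from the peak start, with -1 meaning no summit was called. A minimal sketch in plain Python follows; the function name and the example line are made up.

```python
def summit_from_narrowpeak(line: str):
    """Return (chrom, summit_position) for one narrowPeak line.

    The absolute summit position is chromStart plus the offset stored in
    column 10. Returns None when the offset is -1 (no summit called).
    """
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    start = int(fields[1])
    offset = int(fields[9])   # column 10 in 1-based counting
    if offset < 0:
        return None
    return chrom, start + offset


if __name__ == "__main__":
    # Hypothetical narrowPeak record.
    record = "chr1\t1000\t1600\tpeak_1\t250\t.\t8.5\t30.1\t27.2\t210"
    print(summit_from_narrowpeak(record))
```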

ADD REPLY • link 5.1 years ago by ATpoint 85k

score 2 · Answer 2 · 2019-10-29

2

5.1 years ago

venu 7.1k

You can pass a custom count matrix to DiffBind (check the reference manual, page 3: Construct a DBA object).

For each sample, make a BED file of super-enhancers.
Calculate read counts from all your BAM files (using deepTools' multiBamSummary).
Pass this to DiffBind.
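To get from the counting step to a plain matrix, multiBamSummary can also write a tab-delimited table via its `--outRawCounts` option. The parser below is a rough sketch for such a table. It assumes a header line of the form `#'chr' 'start' 'end' 'file1.bam' ...` followed by one row per region; check your own output before relying on that layout. The function name `read_count_table` is hypothetical.

```python
import csv
import io


def read_count_table(handle):
    """Parse a tab-delimited region-by-sample count table.

    Expects three coordinate columns (chrom, start, end) followed by one
    count column per sample. A leading '#' and quotes in the header are
    stripped. Returns (sample_names, regions, counts).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = [name.lstrip("#").strip("'\"") for name in next(reader)]
    samples = header[3:]
    regions, counts = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        regions.append((row[0], int(row[1]), int(row[2])))
        counts.append([float(value) for value in row[3:]])
    return samples, regions, counts


if __name__ == "__main__":
    # Hypothetical table content, for illustration only.
    text = (
        "#'chr'\t'start'\t'end'\t'file1.bam'\t'file2.bam'\n"
        "chr1\t750\t1550\t42.0\t17.0\n"
        "chr1\t4750\t5250\t8.0\t11.0\n"
    )
    samples, regions, counts = read_count_table(io.StringIO(text))
    print(samples)
    print(regions)
    print(counts)
```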

ADD COMMENT • link 5.1 years ago by venu 7.1k

0

Hi venu, I am a bit lost and would appreciate your help. I have 10 BED files from 10 samples, each with different start and end coordinates for its SEs. I am confused about how to make a common BED file from all of them in order to generate the read count matrix.

Before using the

 "multiBamSummary BED-file --BED selection.bed --bamfiles file1.bam file2.bam -o results.npz"

what should I use to make the common bed file (selection.bed):

bedops --intersect or
bedops --everything or
bedops --partition or
bedops --merge or
bedtools intersect

Can you please explain it in more detail and help?

Thanks

ADD REPLY • link 5.0 years ago by Researcher ▴ 130

1

If you're using DiffBind, just set consensus=TRUE. It will derive a consensus peakset between all samples for which it will compare the signal between samples.
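To make the idea of a consensus set concrete, here is a rough Python sketch of one common way to build it: merge overlapping regions across all samples and keep only merged regions supported by a minimum number of samples. This is not DiffBind's own implementation; `consensus_regions` and `min_overlap` are names chosen for this illustration only.

```python
def consensus_regions(peaksets, min_overlap=2):
    """Merge regions across samples and keep those seen in enough samples.

    peaksets: dict mapping sample name -> list of (chrom, start, end),
              using 0-based, half-open coordinates as in BED.
    min_overlap: minimum number of distinct samples that must contribute
                 to a merged region for it to be kept.
    """
    tagged = sorted(
        (chrom, start, end, sample)
        for sample, peaks in peaksets.items()
        for chrom, start, end in peaks
    )
    merged = []  # each entry: [chrom, start, end, set_of_samples]
    for chrom, start, end, sample in tagged:
        last = merged[-1] if merged else None
        if last is not None and last[0] == chrom and start < last[2]:
            last[2] = max(last[2], end)   # extend the current region
            last[3].add(sample)
        else:
            merged.append([chrom, start, end, {sample}])
    return [
        (chrom, start, end)
        for chrom, start, end, support in merged
        if len(support) >= min_overlap
    ]


if __name__ == "__main__":
    # Hypothetical per-sample SE coordinates.
    example = {
        "sample1": [("chr1", 100, 500), ("chr1", 2000, 2500)],
        "sample2": [("chr1", 400, 900)],
        "sample3": [("chr2", 10, 50)],
    }
    print(consensus_regions(example, min_overlap=2))
```

Setting `min_overlap=1` gives the plain union of all regions, which is roughly what a merge-style operation over all 10 BED files would produce.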

ADD REPLY • link 5.0 years ago by jared.andrews07 ★ 18k

0

Hi Jared, thank you for your reply. I just posted the same question here, please have a look. I hope you meant the same thing?

I am really stuck with this and looking for a way out.

ADD REPLY • link 5.0 years ago by Researcher ▴ 130

1

Yes, that looks fine, but you should really follow the advice in geek_y's answer/ATpoint's comment above. Comparing SEs in any quantitative fashion is pointless due to the size of the ranges involved. Looking at the constituent peaks that compose them is a much more productive use of your time.

ADD REPLY • link 5.0 years ago by jared.andrews07 ★ 18k

score 2 · Answer 3 · 2019-10-29

I think I know what you mean:
I used deepTools2 for that. I think it is similar to the figures published by Loven J. et al. 2013 in Cell, which show the difference in average binding between conditions at SEs and normal enhancers.
If so:
A.) Take the SE BED file and get the median or mean length of all SEs.
B.) Run deepTools2 computeMatrix on the SE BED regions: computeMatrix scale-regions -S <bigWig file(s)> -R <bed SE regions> -b <median or mean size of your SE> (you need to convert your BAM files to bigWig first, for example with bamCoverage).
C.) Use the underlying matrix values for further quantitative assessment.
D.) Plot the values, for example with plotHeatmap.
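Step A can be done with a few lines of plain Python. This is a sketch that assumes a standard tab-separated BED file with chrom, start and end in the first three columns; the function name `se_length_summary` is made up.

```python
from statistics import mean, median


def se_length_summary(bed_lines):
    """Return (median_length, mean_length) of the regions in a BED file.

    bed_lines: iterable of text lines. Blank lines and comment, track or
    browser lines are skipped.
    """
    lengths = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        start, end = int(fields[1]), int(fields[2])
        lengths.append(end - start)
    return median(lengths), mean(lengths)


if __name__ == "__main__":
    # Hypothetical SE coordinates; in practice pass an open file handle.
    example = [
        "chr1\t0\t10000\n",
        "chr1\t20000\t50000\n",
        "chr2\t0\t20000\n",
    ]
    med, avg = se_length_summary(example)
    print(f"median SE length: {med}, mean SE length: {avg}")
```

The resulting number is what you would then plug into the computeMatrix call in step B.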
Hope that helps & best wishes