Hello
First post & new to this. I have peaks from cut & tag for H3K27ac (SEACR) and also for ATAC-Seq (MACS2)
I'd like to overlap them to get a file of overlapping peaks.
What is the best way to do this?
Thank you
Hello
First post & new to this. I have peaks from cut & tag for H3K27ac (SEACR) and also for ATAC-Seq (MACS2)
I'd like to overlap them to get a file of overlapping peaks.
What is the best way to do this?
Thank you
bedtools intersect
is the answer. https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html
I would follow a standard ATAC-seq/ChIP-seq pipeline and include that within the MACS2 peak calling as discussed here:
One consideration is the amount of overlap you're looking for. Default behavior will take any amount of overlap. You can change this to require a greater amount or to allow a gap between the peaks.
I prefer a stricter threshold for overlapping peaks, where in this case you might require at least 25-50% of the ATAC peak is overlapped.
Also with bedtools pay attention to what you are outputting.
In general, if you provide the ATAC-peaks as -a
and K27ac as -b
then the output will be the original ATAC-peaks that are marked by K27ac. I imagine this is your desired output, but if you wanted to see K27ac peaks that contained accessible regions, then you would input the K27ac peaks as -a
.
Also, the default behavior is to output a line from -a
for every overlap found in -b
.
Sometimes this isn't really an issue, but for example if you had a really large ATAC peaks covered by two separate K27ac peaks (assuming ATAC peaks are input as -a
), then you would get the same ATAC-seq peak twice in the output file. I usually use the -u
flag to keep a one line per peak ratio.
This is all obvious in their documentation which is comprehensive, and sorry if it's all obvious to you, but it wasn't obvious to me originally.
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
In R one would use
findOverlaps()
orsubsetByOverlaps()
from the GenomicRanges package.