Question

Best way to combine biological replicate peaks?

0

Entering edit mode

7 months ago

SpartanII • 0

I have performed ATAC-seq and have three biological replicates and would like to combine their macs2 narrowPeak files to get the common peaks between them, so I can run GO overrepresentation on it after annotating it to the nearest TSS so I can have GO terms specific to that condition. What is the best way to merge these narrowPeak files so I can achieve this?

Should I merge the BAM files at the start for the replicates in picard and then call peaks on the merged BAM with macs2 or should I use bedtools multiinter to find intersecting intervals or bed tools merge on narrowPeak files to combine replicates or is there a better way?

Thanks.

ATAC-seq • 1.1k views

ADD COMMENT • link updated 7 months ago by jared.andrews07 ★ 19k • written 7 months ago by SpartanII • 0

0

Entering edit mode

have GO terms specific to that condition

Peak calling won't ever give you specific results. You can easly get thousands of peaks fewer or more if noise is a little higher or lower in one vs the other condition. If you want to enrich for condition specificity then perform a differential analysis with tools like DESeq2, edgeR or limma. Meaning, call peaks, make a consensus peakset (many posts on this before here and elsewhere) and then assess significance of count differences by mentioned tools.

ADD REPLY • link 7 months ago by ATpoint 90k

0

Entering edit mode

Thank you for your answer. Would maybe running IDR on the replicates help? It would get reproducible peaks amongst the biological replicates right? I really want to produce a figure like below I found from this article: https://www.researchgate.net/figure/Gene-ontology-and-KEGG-pathway-analysis-of-the-mouse-ATAC-seq-data-a-Top-ten-GO-terms_fig3_350469741. To do this I feel like I'd need to run something like chipseeker and annotate the peaks in each condition group and would really like the wild type group to also be present on the figure too.

enter image description here

ADD REPLY • link 7 months ago by SpartanII • 0

score 1 · Answer 1 · 2025-04-30

As ATpoint mentions, this is kind of a messy process. I have done it both ways - subsampling and merging replicate BAMs into one and using that for peak calling for simplicity's sake or post-hoc merging of peaks called for each replicate.

Lately, I've been using rmpsc for deriving consensus peaksets from replicates on a per-group basis, and then merging those peaksets across groups for differential analyses. Which seems to work okay and clean things up a bit.