I have performed ATAC-seq and have three biological replicates and would like to combine their macs2 narrowPeak files to get the common peaks between them, so I can run GO overrepresentation on it after annotating it to the nearest TSS so I can have GO terms specific to that condition. What is the best way to merge these narrowPeak files so I can achieve this?
Should I merge the BAM files at the start for the replicates in picard and then call peaks on the merged BAM with macs2 or should I use bedtools multiinter to find intersecting intervals or bed tools merge on narrowPeak files to combine replicates or is there a better way?
Thanks.
Peak calling won't ever give you specific results. You can easly get thousands of peaks fewer or more if noise is a little higher or lower in one vs the other condition. If you want to enrich for condition specificity then perform a differential analysis with tools like DESeq2, edgeR or limma. Meaning, call peaks, make a consensus peakset (many posts on this before here and elsewhere) and then assess significance of count differences by mentioned tools.
Thank you for your answer. Would maybe running IDR on the replicates help? It would get reproducible peaks amongst the biological replicates right? I really want to produce a figure like below I found from this article: https://www.researchgate.net/figure/Gene-ontology-and-KEGG-pathway-analysis-of-the-mouse-ATAC-seq-data-a-Top-ten-GO-terms_fig3_350469741. To do this I feel like I'd need to run something like chipseeker and annotate the peaks in each condition group and would really like the wild type group to also be present on the figure too.