Best way to combine biological replicate peaks?
17 days ago
SpartanII • 0

I have performed ATAC-seq with three biological replicates and would like to combine their macs2 narrowPeak files to get the peaks common to all of them. I then want to annotate those peaks to the nearest TSS and run GO overrepresentation analysis, so that I have GO terms specific to that condition. What is the best way to merge these narrowPeak files to achieve this?

Should I merge the replicates' BAM files at the start with picard and then call peaks on the merged BAM with macs2? Or should I combine the replicates afterwards, using bedtools multiinter to find intersecting intervals or bedtools merge on the narrowPeak files? Or is there a better way?

Thanks.

ATAC-seq • 440 views

have GO terms specific to that condition

Peak calling alone won't ever give you condition-specific results. You can easily get thousands of peaks fewer or more if noise is a little higher or lower in one condition vs. the other. If you want to enrich for condition specificity, perform a differential analysis with tools like DESeq2, edgeR or limma. That is: call peaks, make a consensus peakset (there are many posts on this here and elsewhere), and then assess the significance of count differences with the mentioned tools.
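To make the "consensus peakset" step concrete, here is a minimal pure-Python sketch. It assumes one common convention (not necessarily what any particular tool does): pool the peaks from all replicates, merge overlapping intervals, and keep a merged region only if at least `min_support` distinct replicates contributed to it. The function names and the `min_support` parameter are illustrative, not part of macs2 or bedtools.

```python
from collections import defaultdict


def read_narrowpeak(lines):
    """Parse narrowPeak/BED-style lines into (chrom, start, end) tuples."""
    peaks = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track")):
            continue  # skip blank lines and header/comment lines
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        peaks.append((chrom, int(start), int(end)))
    return peaks


def consensus_peaks(replicates, min_support=2):
    """Merge overlapping peaks across replicates and keep merged regions
    that contain peaks from at least `min_support` distinct replicates.

    `replicates` is a list of peak lists, one per replicate.
    Assumption: BED-style half-open coordinates, so peaks overlap
    when start < current end.
    """
    by_chrom = defaultdict(list)
    for rep_id, peaks in enumerate(replicates):
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end, rep_id))

    consensus = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end, rep_id = intervals[0]
        reps = {rep_id}
        for start, end, rep_id in intervals[1:]:
            if start < cur_end:
                # Overlaps the current merged region: extend it.
                cur_end = max(cur_end, end)
                reps.add(rep_id)
            else:
                # Gap: emit the finished region if enough replicates agree.
                if len(reps) >= min_support:
                    consensus.append((chrom, cur_start, cur_end))
                cur_start, cur_end, reps = start, end, {rep_id}
        if len(reps) >= min_support:
            consensus.append((chrom, cur_start, cur_end))
    return consensus


# Toy example with three replicates (made-up coordinates).
rep1 = [("chr1", 100, 200), ("chr1", 500, 600)]
rep2 = [("chr1", 150, 250), ("chr1", 900, 1000)]
rep3 = [("chr1", 180, 220), ("chr1", 550, 650)]
print(consensus_peaks([rep1, rep2, rep3], min_support=2))
```

Note that merging by overlap can chain peaks together, so a merged region counts as supported even if its supporting peaks do not all overlap each other directly. The resulting intervals would then be used to count reads per sample before the DESeq2/edgeR/limma step.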


Thank you for your answer. Would running IDR on the replicates help? It would give reproducible peaks among the biological replicates, right? I really want to produce a figure like the one in this article: https://www.researchgate.net/figure/Gene-ontology-and-KEGG-pathway-analysis-of-the-mouse-ATAC-seq-data-a-Top-ten-GO-terms_fig3_350469741. To do this I feel like I'd need to run something like ChIPseeker to annotate the peaks in each condition group, and I would really like the wild-type group to be present on the figure too.

16 days ago

As ATpoint mentions, this is kind of a messy process. I have done it both ways: subsampling and merging replicate BAMs into one and calling peaks on that for simplicity's sake, or post-hoc merging of the peaks called for each replicate.

Lately, I've been using rmspc to derive consensus peaksets from replicates on a per-group basis, and then merging those peaksets across groups for differential analyses, which seems to work okay and cleans things up a bit.
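The "merge per-group peaksets across groups" step can be sketched as a plain interval union. This is only an illustration in Python with made-up coordinates and a hypothetical `merge_groups` helper; it treats overlapping or directly abutting peaks as one region and does not reproduce what the consensus tool itself does.

```python
def merge_groups(*group_peaksets):
    """Union per-group consensus peaks into one non-overlapping peakset.

    Each argument is a list of (chrom, start, end) tuples for one group.
    Overlapping or abutting intervals on the same chromosome are merged.
    """
    pooled = sorted(peak for peaks in group_peaksets for peak in peaks)
    merged = []
    for chrom, start, end in pooled:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Same chromosome and touching the previous region: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical per-group consensus peaks for two conditions.
wt_peaks = [("chr1", 100, 250), ("chr2", 10, 50)]
ko_peaks = [("chr1", 200, 300), ("chr1", 400, 450)]
print(merge_groups(wt_peaks, ko_peaks))
```

Using the union across groups as the counting regions means peaks present in only one condition are still tested in the differential analysis rather than being dropped up front.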
