May I ask how do you use your ATAC-seq data for post-alignment analysis?
Here's what I'm doing and I have a few problems:
For example, there are two groups of samples, WT and KO. In order to define the peak status, I merge all bam files (all replicates of two groups) and call a TOTAL peak using MACS2.
Then in R, WT and KO peak file is used to define peaks status by
ChIPpeakAnno::findOverlappingPeaks
, respectively. if there's overlap between WT and TOTAL, this TOTAL peak is defined as "Open in WT".ChIPpeakAnno::annoPeaks
do the annotation part with parameters(bindingType = "startSite", bindingRegion = c(-3000,3000))
.
Problem: A gene can be annotated by many peaks, which show different status, eg. "OpenToClose", "CloseToOpen","BothOpen","NeitherOpen".
Does it make sense if I use these genes for further analysis? such as GO enrichment, their expression levels in RNAseq. If not, how to do it in a better way?
Hi, ATpoint.
Thanks a lot for your reply! I have some questions about your advices. Do you mean I should change the
bindingRegion
parameter intoc(-500,0)
to annotate the peaks more precisely?…and for the DA part, actually I did use the
computeMatrix scale-regions
to build a 'expression matrix', with the parameters-a 0 -b 0 -bs 1 --regionBodyLength 1
. However, the result of following analysis using DESeq2 shows few significant differential peaks because of the high p-values/p-adjust. That’s why I turn to use the 'OpenClose' criteria.