You can use GenomicRanges in R to merge peaks (as either the union or intersection, depending upon what you are looking for).
For example, let's say you have a .bed file (without a header), and you can create a GenomicRanges object as follows:
gr1 = reduce(GRanges(Rle(input.table$V1),
IRanges(start=input.table$V2, end=input.table$V3)))
In this situation, I don't think the reduce
command is necessary (to merge overlapping rows), but it won't hurt.
If you wanted to then create the union of two peaks, you could use:
peakGr = union(gr1, gr2)
Of, if you wanted to create the intersection of two peaks, you could use:
peakGr = intersect(gr1, gr2)
You can then quantify counts with featureCounts, htseq-count, etc. Of, if you only have 2 samples, you can try comparing one sample as the reference for peak calling (for what is intended to be used for "input" samples).
Thanks Charles for your suggestion.
You are welcome - although this is a comment to the question, rather than my answer :)
Thanks Friederike.
For example, I see two different peaks in a 5kb Gene. A peak near 5'UTR and first exon and another peak near exon 7.
I am not sure if chromatin is open only at these two sites. For same samples, I see RNA expression is very high in test sample compared to control. I was wondering, can a Gene have just two ATAC peaks at these two exons and closed on rest of Gene and expression can be that high. I know, associating expression to ATAC is not that straightforward, however, I felt I missing some knowledge here such as A. Should we assume that chromatin is open for this Gene, based on these two peaks. In such case, can I combine peaks.
OR
B. Observing two peaks in test and not in control at two different sites and no evidence of peaks in other sites of genes, should be considered as open chromatin for the gene.
As ATAC-seq may not be that sensitive to show large peak over the entire gene
Thanks
If you could attach a screenshot with coverage tracks, it becomes easier and meaningful to interpret the signal.
In concur with Chirag plus I will say that drawing conclusions on how to treat all peaks based on a single observation is probably not the most robust way forward.