Best Practices for ATACseq for Performing Differential Accessibility Analysis
1
1
Entering edit mode
2 days ago

Hello,

I have a question about best practices for performing differentially accessibility analysis for ATACseq data. I am a recent college graduate working in my first bioinformatics lab, so I apologize in advance if my terminology or wording is confusing/not fully correct.

I am analyzing bulk ATACseq data from treated and control samples isolated at 3 different timepoints. I have 3 replicates per condition for each of the 3 timepoints, so there are 18 samples total. We have previously performed bulk RNA-seq on these same samples, which showed fairly robust differential gene expression based on treatment. Given this, my initial hypothesis was that we would also find differentially accessible regions dependent on treatment corresponding to the changes in gene expression. However, after my first pass of the data, I found very low (<50) to no differentially accessible regions between treated and control samples (or even between timepoint).

As background on how I processed the samples:

  1. I processed the data via the nf-core ATACseq pipeline (standard processing).
  2. I then used MACS3 to call peaks for each of the samples, which identified anywhere from ~35k to ~75k significant peaks per sample. From those sample-specific peaks,
  3. I then used bedtools to merge peaks from all 18 samples to generate a list of ~300k consensus peaks present in at least one of the 18 samples.
  4. I then used the csaw package to generate a matrix of raw counts present in the regions defined by these 300k peaks.
  5. With this raw counts matrix for all 18 samples, I performed differential accessibility testing between treated and control samples using DESeq2. However, like I mentioned above, there were very few to no consensus peaks for several of the contrasts between treated and control samples that I performed.

I am thinking of reasons as to why so few differentially accessible regions were identified (the robust differential gene expression between treatment/control leads me to believe that we should also observe at least some modest differentially accessibility). My first thought is that, because the consensus peak list is so large (300k peaks) for the 18 samples, the large number of tests being conducted means the FDR would make it "difficult" to identify statistically significant differentially accessible regions.

Secondly, when I merged peaks across the 18 samples to obtain the 300k consensus peak list, I used a 1bp overlap to identify a peak as shared between any two samples. So because I took the whole range of peaks within that overlap to define the region of the "shared peak", some of the shared peaks might be quite large. I am not sure if this is true, but I would imagine that there is a lot of "noise" or random signal the further away you are from the summit of the peak. So basically, if a merged peak had a 3k bp window, there could be a lot of random noise in that huge window that could affect the ability to call differential peaks between samples.

Given that reasoning, I am thinking about how to revise this analysis. One idea I had is to merge called peaks for just the specific samples I am interested in doing DA analysis for, and run DESeq2 on that smaller consensus peak list. The second idea would be to first identify peaks shared between all 3 replicates for a given timepoint and condition. And then taking this consensus peak list for each of the 6 groups of 3 replicates and merging them (which would also reduce the total # of peaks, I think).

I would appreciate if anybody has feedback on my reasoning and has suggestions/insights for how I might approach this data. Again, I apologize if I was not clear in describing the data and my thoughts. I can provide my code if that is helpful.

differential-accessibility csaw DESeq2 ATACseq • 284 views
ADD COMMENT
1
Entering edit mode

Your reasoning makes sense to me. I do something similar. Usually I am stringent on per-sample peak calls by fording a minimum peak width and FDR cutoff, and then I only keep peaks shared between at least two/three/... (depends on group size) samples. That massively reduces the total number of peaks and helps with multiple testing burden. It also eliminates a lot of noisy/nonsense peaks.

ADD REPLY
2
Entering edit mode
14 hours ago
Malcolm.Cook ★ 1.5k

I have had great success using genrich as peak caller for ATAC-Seq upstream of csaw. I recommend you consider it.

With genrich, replicates are jointly analyzed, allow you to produce a single peak-set for each experimental condition (note: I believe that using genrich parameters -q .05 -a 0.0 are arguably sensible and find they produced empirically plausible results).

In your case, you would produce 6 peak-sets, one for each combination of time-point and condition. (drag&drop them into IGV to visualize!)

These peak-sets are in narrowPeak format, with each peak attributed with values used in its calling, along with a final pValue and qValue, allowing subsequent filtering, if so desired.

You can then merge your 6 (possibly filtered) condition-specific peak sets into a single reference peak-set. (note: I prefer the term reference since consensus connotes agreement, whereas you may well find peaks in one condition that are absent in another, lacking consensus, which will undoubtedly be quite interesting to your analysis).

You may implement the merge using bedtools, but I found it valuable (e.g. for visualization in IGV) to instead use the capabilities provided by R/BioConductor's GenomicRanges::reduce to merge proximal (e.g. within 100bp) peaks and carry forward, as narrowPeak attributes of each resultant reference peak, the narrowPeak attributes of the input merged peak having the smallest pValue. (again: drag&drop into IGV to visualize!)

If you do this, I bet your reference set has far fewer than your current 300K. Let us know!

I then use csaw to (a) test for differential accessibility every 50bp window overlapping any peak in the reference peak-set (specifically using csaw's glmQLFTest with adjusted p value=.05) and (b) combine the test results at the reference peak level (thus controlling the overall false discovery rate), using csaw::overlapResults.

The results seem quite plausible in terms of downstream interpretations (e.g. motif enrichment, GO analysis, etc) recapitulating know players.

Let us know how it goes for you!

FWIW: some un-discussed aspects:

  • read mapping - I use STAR
  • use of a "blacklist" bed file - I make one from INPUT ChIP-Seq loosely following https://www.nature.com/articles/s41598-019-45839-z - depending on your organism/genome version you may find one from another lab readily available
  • production of 50bp window coverage matrix - I used bedtools makewindows and featureCounts for tabulation followed by quantile normalization (because my sample preps seemed so disparate in dynamic range for historical reasons)
  • QA
ADD COMMENT
0
Entering edit mode

In my own work I made the following observations:

(1) Unless I'm looking at really distinct samples (like different tissues), I almost never see a region go from completely closed in one condition to open in another

(2) Even when there are very large differences between conditions, the background of ATAC is generally so clean that calling peaks on all samples simultaneously gives a set of peaks with a Jaccard index of >0.95 to what I get from calling separately and using bedtools intersect

(3) I tend to "believe" the unique peaks from calling by merging at fragment level versus those, moreso than the unique peaks from per-condition followed by bedtools, they seem more likely to be in promoters (for instance).

This is not a suggestion that this approach is OK necessarily, but an invitation for discussing any observations you (or any reader) may have that run counter to my own.

ADD REPLY

Login before adding your answer.

Traffic: 3673 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6