Hi everyone,
I'm fairly new to ATAC-seq analysis and learning everything as I go.
I'm currently analyzing ATAC-seq data processed using the PEPATAC pipeline, followed by differential accessibility analysis using DiffBind. I performed triplicate experiments for each treatment group (A, B, C), and most replicates cluster well — except for one sample in the B group (B_rep3), which consistently appears separated in the PCA plot.
Here’s what I did:
- Preprocessing: PEPATAC pipeline
- Peak calling: Iterative merging to get a consensus peak set
- Read counting: dba.count() on iterative peaks
- Normalization: dba.normalize() with native method and background bin size
- Analysis: dba.analyze() with both DESeq2 and edgeR
- PCA: dba.plotPCA() on raw, normalized, and analysis objects
The issue:
In the PCA plot generated by dba.plotPCA(), B_rep3 clusters apart from B_rep1 and B_rep2. However, based on preseq plots, B_rep3 actually has the best library complexity and saturation.
FRiP and read counts (before normalization):
- B_rep1: 13M reads, FRiP 0.23
- B_rep2: 12M reads, FRiP 0.25
- B_rep3: 26M reads, FRiP 0.29
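To put these numbers on a common scale, here is a small Python sketch that converts each FRiP value into an approximate reads-in-peaks count. It assumes the read counts listed above are the same denominators that were used to compute FRiP, which is not stated here. The helper name `reads_in_peaks` is made up for illustration.

```python
# Read depth and FRiP values as listed above.
samples = {
    "B_rep1": (13_000_000, 0.23),
    "B_rep2": (12_000_000, 0.25),
    "B_rep3": (26_000_000, 0.29),
}


def reads_in_peaks(total_reads, frip):
    """Approximate reads in peaks.

    Assumption: FRiP was computed with total_reads as its denominator.
    """
    return total_reads * frip


in_peaks = {name: reads_in_peaks(n, f) for name, (n, f) in samples.items()}
baseline = in_peaks["B_rep1"]

for name, value in in_peaks.items():
    # Report each sample relative to B_rep1 to make the depth gap visible.
    print(f"{name}: ~{value / 1e6:.2f}M reads in peaks "
          f"({value / baseline:.2f}x B_rep1)")
```

Under that assumption, B_rep3 contributes roughly 2.5 times as many reads in peaks as B_rep1 or B_rep2, so the depth gap inside peaks is larger than the 2x gap in total reads.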
I assumed normalization would correct for this, but B_rep3 still clusters away.
Interestingly, when I apply DESeq2's vst() transformation to the raw count matrix and plot PCA, the replicates cluster perfectly — but I know vst is usually for RNA-seq, not ATAC-seq.
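As a toy illustration of one thing vst() does besides stabilizing variance: it works on counts scaled by DESeq2 size factors, which are estimated from the count matrix itself by a median-of-ratios rule. The Python sketch below is a simplified stand-in (median-of-ratios scaling followed by log2(x + 1)), not vst() itself. The counts are invented, and the function names are hypothetical. It only shows that a pure depth difference disappears under this kind of scaling. It does not establish why B_rep3 separates in the DiffBind PCA.

```python
import math
from statistics import median

# Made-up counts over five peaks: NOT the data from this post.
# B_rep3 is B_rep1 sequenced 2.5x deeper; B_rep2 adds small replicate noise.
counts = {
    "B_rep1": [20, 50, 100, 400, 1000],
    "B_rep2": [22, 48, 105, 390, 1010],
    "B_rep3": [50, 125, 250, 1000, 2500],
}


def size_factors(counts):
    """Median-of-ratios size factors (assumes every count is > 0)."""
    names = list(counts)
    n_peaks = len(counts[names[0]])
    # Per-peak geometric mean across samples, computed in log space.
    log_geo = [
        sum(math.log(counts[s][i]) for s in names) / len(names)
        for i in range(n_peaks)
    ]
    return {
        s: median(counts[s][i] / math.exp(log_geo[i]) for i in range(n_peaks))
        for s in names
    }


def log_scaled(values, factor):
    """Divide by a size factor, then apply log2(x + 1)."""
    return [math.log2(v / factor + 1) for v in values]


def distance(a, b):
    """Euclidean distance between two samples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


sf = size_factors(counts)

# Log transform only, with no depth correction.
unscaled = {s: log_scaled(v, 1.0) for s, v in counts.items()}
# Log transform after dividing by the per-sample size factor.
scaled = {s: log_scaled(v, sf[s]) for s, v in counts.items()}

unscaled_dist = distance(unscaled["B_rep1"], unscaled["B_rep3"])
scaled_dist = distance(scaled["B_rep1"], scaled["B_rep3"])

print(f"size factor ratio rep3/rep1: {sf['B_rep3'] / sf['B_rep1']:.2f}")
print(f"rep1 vs rep3 distance, log2 only:        {unscaled_dist:.3f}")
print(f"rep1 vs rep3 distance, scaled then log2: {scaled_dist:.3f}")
```

In this toy case the size factors recover the 2.5x depth ratio from the peak counts alone, and the distance between B_rep1 and B_rep3 collapses to essentially zero once the counts are scaled. Whether the same effect explains the difference between the vst-based PCA and the dba.plotPCA() output would depend on how the background-based normalization scaled B_rep3 in the real data.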
My questions:
- Is it okay to include vst-based PCA plots for visualization in ATAC-seq, even if not used downstream?
- Would reviewers criticize the PCA separation, or is it fine as long as QC (preseq, FRiP) is strong?
- Would it make sense to include QC plots in the Supplement to justify keeping the sample?
- Why might vst give better clustering, and is it acceptable to show both?
Thanks so much in advance — any thoughts or similar experiences would help a lot!
Thank you so much for your kind and clear explanation, this really helped clarify what I was unsure about! I truly appreciate you taking the time to share your insight. Wishing you all the best in your research and everything ahead!
Don't ask me why but the spam bot got triggered and suspended you. I reinstated you, not sure why that happened.
The tone was too nice :-) The bot thought it couldn't possibly be about bioinformatics