While analyzing an ATAC-seq data set with DEseq2, I got the below histogram of P values when comparing the WT and KO conditions. Since I didn't get a uniform distribution or one with a higher frequency of low p values, I think some assumption of how DEseq2 does the analysis is being violated. Whatever issue I have doesn't seem to be across all of my data, however, since I also have accessibility at multiple differentiation stages for each genotype, and between those stages, I get a well behaved distribution with many small p values. I certainly expect the differences in accessibility to be larger between different differentiation stages than between the WT and KO at the same stage, but the main goal of my analysis is to identify the differences between the WT and KO at these stages. Based on other resources I've found online, I've tried several things to resolve this. I repeated the analysis with other packages, including edgeR and limma/Voom, tried to correct the P values with fdrtool, and tried removing less accessible peaks. None of these solutions made a big difference though. Are there better ways of doing the analysis?
Thank you!
Instead of abstract pvalue distributions you should add standard QC plots and metrics. For example, FRiPs, PCA; MA-plots between conditions. Something to actually "look" at the data.
Most likely you have a confounder (e.g., a batch effect), since that's the most common cause of this sort of distribution. As suggested by ATpoint, PCA or similar QC plots will be informative here. More likely than not you just left something out of your model (e.g., stage) and this is the confounder.
Below are the PCA plots generated from running the following code:
The design I'm using is ~replicate + condition where condition contains both the genotype and stage as in the PCA plot legend. I've also tried a couple other designs with genotype and stage separate and/or accounting for an interaction between the two and it hasn't really changed the resulting p values.
You'll really want to remove the
limma::removeBatchEffect()
line.