I have created a cross-study RNA-seq expression matrix (20 studies, >250 samples in total) of the same cell type under different conditions, and I planned to do a combined DE analysis to check for similarities in DE across the different comparisons. I 'pooled' all controls, used condition + series as the design to account for the batch effect, and then plotted a PCA to get an idea of the data.
I then removed the batch effect using limma for visualization purposes (as suggested here: https://support.bioconductor.org/p/76099/#93176). Here is the PCA of the DESeq2-transformed data without and with removal of the 'study'/batch effect (color = study):
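For reference, the batch-removal step for visualization looks roughly like the sketch below. It is only an illustration under assumptions: the counts are simulated with `makeExampleDESeqDataSet` as a stand-in for the real matrix, the column names `series` and `condition` and their levels are placeholders, and the variance stabilizing transformation is assumed as the 'DESeq2 transformation' (rlog would be handled the same way).

```r
library(DESeq2)
library(limma)

# Simulated stand-in for the real cross-study count matrix (hypothetical data):
# 3 studies ("series") with 2 control and 2 treated samples each.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds$series    <- factor(rep(c("S1", "S2", "S3"), each = 4))
dds$condition <- factor(rep(c("control", "treated"), times = 6))
design(dds)   <- ~ series + condition

# Variance stabilizing transformation; blind = FALSE uses the design
# when estimating dispersions.
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)

# PCA before removing the study effect, colored by study.
p_before <- plotPCA(vsd, intgroup = "series")

# Remove the study effect from the transformed values only (for plotting),
# protecting the condition effect through the design matrix.
mat <- assay(vsd)
mm  <- model.matrix(~ condition, data = as.data.frame(colData(vsd)))
vsd_corrected <- vsd
assay(vsd_corrected) <- removeBatchEffect(mat, batch = vsd$series, design = mm)

# PCA after removing the study effect.
p_after <- plotPCA(vsd_corrected, intgroup = "series")
```

The corrected values are used for plotting only; the DE testing itself still runs on the raw counts with the batch term in the design.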
Since the variation among 'control' samples across the different studies is still very large after removing the 'series' batch effect, I would like to do WT vs. condition comparisons for every study separately. Is that a reasonable approach, and is it doable with DESeq2? Most of the studies do have replicates.
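To make the per-study idea concrete, here is a minimal sketch of what I have in mind: subset the dataset by study, fit DESeq2 with a condition-only design in each subset, and compare the resulting log2 fold changes. The data are again simulated, and the names `series`, `condition`, `control` and `treated` are placeholders rather than my real annotation. Studies without replicates would need to be excluded from such a loop.

```r
library(DESeq2)

# Simulated stand-in for the real data (hypothetical): 3 studies, 2 vs 2 each.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds$series    <- factor(rep(c("S1", "S2", "S3"), each = 4))
dds$condition <- factor(rep(c("control", "treated"), times = 6))

# Run a separate control vs. treated analysis within each study.
per_study <- lapply(levels(dds$series), function(s) {
  sub <- dds[, dds$series == s]
  sub$series    <- droplevels(sub$series)
  sub$condition <- droplevels(sub$condition)
  design(sub)   <- ~ condition
  sub <- DESeq(sub, quiet = TRUE)
  results(sub, contrast = c("condition", "treated", "control"))
})
names(per_study) <- levels(dds$series)

# Compare studies by correlating their log2 fold changes gene by gene.
lfc <- sapply(per_study, function(res) res$log2FoldChange)
lfc_cor <- cor(lfc, use = "pairwise.complete.obs")
```

Each element of `per_study` is a results table for one study, and `lfc_cor` gives a rough view of how similar the DE patterns are across studies.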
Thank you very much for your input!