I am working on an RNA-seq experiment with two groups of patients that are highly unbalanced in size: one group has 710 patients, while the other has only 15. I’m concerned about how this imbalance might impact the differential expression analysis, particularly with respect to statistical power and the accuracy of the results.
I plan to use DESeq2 for the analysis and would like to know if there are specific strategies or adjustments within DESeq2 that can help manage this large disparity in group sizes. Are there any best practices for using DESeq2 in such cases to minimize potential biases? I would appreciate any insights or suggestions for handling this situation effectively!
See this answer from @Mike Love, the author of DESeq2, which addresses your question: https://support.bioconductor.org/p/p134634/
I would also recommend reading through https://support.bioconductor.org/p/87507/
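For a rough sense of what a 710 vs 15 split means for power, here is a back-of-the-envelope sketch. It assumes a simple two-group comparison of means with equal variance in both groups, which is not DESeq2's negative binomial model, so treat it as intuition only. The function name `effective_n_per_group` is made up for this illustration.

```python
def effective_n_per_group(n1: int, n2: int) -> float:
    """Per-group size of a balanced design whose difference in means has the
    same variance as an unbalanced n1 vs n2 design.

    Assumption (not from DESeq2): equal-variance two-sample comparison, where
    Var(mean1 - mean2) = sigma^2 * (1/n1 + 1/n2). Setting this equal to
    2 * sigma^2 / n and solving for n gives the harmonic mean of n1 and n2.
    """
    return 2 * n1 * n2 / (n1 + n2)


# Group sizes from the question.
large, small = 710, 15
print(f"{large} vs {small} is roughly equivalent to "
      f"{effective_n_per_group(large, small):.2f} per group in a balanced design")
# -> 710 vs 15 is roughly equivalent to 29.38 per group in a balanced design
```

Under that assumption the precision of the comparison is limited by the 15-patient group: however large the other group gets, the design can never do better than about 30 per group in a balanced study.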