I have combined several datasets containing a specific cell type. The datasets come from different (but related) tumors, studies, and timepoints. I want to run a differential expression analysis on these cells, comparing patients who survived with those who did not. The problem is that, in an initial PCA, the cells cluster by some of these variables. Should I regress out these sources of variation (using the SCTransform option vars.to.regress) before running the differential expression analysis with DESeq2? Or should I run it on the raw counts, even though they show these differences?
My second question is whether I can pass as many variables as I want to vars.to.regress, or whether doing so comes at a cost to accuracy, interpretability, etc.