Hello,
I am seeking advice on my DESeq2 analysis setup.
Our experimental design includes 2 conditions (treatment and control) and 3 exposure times (48h, 24h, 12h), with 4 samples per group. We are also accounting for a batch effect, with 2 samples in each batch. Each exposure time has its own control: when we replace the plate containing the cells to start the treatment, the plate for the matching control is replaced as well. All cells from every condition are harvested on day 2, so cells exposed for 24 or 12 hours receive their exposure during the final 24 or 12 hours before collection and are kept under control conditions before that.
Our primary objective is to identify genes that differ significantly between treatment and control at each exposure time. Additionally, we aim to identify genes that are differentially expressed across the exposure times specifically within the treatment condition.
To address the first objective, I combined Time and Condition into a single grouping variable, which gives the following design:
coldata$TimeCondition <- paste(coldata$Time, coldata$Condition, sep = "_")
design <- ~ Batch + TimeCondition
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata, design = design)
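To make the first objective concrete, here is a minimal base R sketch of the per-time comparisons I have in mind for the grouped design. The sample table is made up (a balanced layout with hypothetical level names such as "48_treatment"), and the last comment assumes the three-element character form of the contrast argument of results(), applied after running DESeq().

```r
# Hypothetical sample table: 2 conditions x 3 times x 2 batches x 2 replicates.
# Level names ("12", "24", "48", "control", "treatment") are assumptions.
coldata <- expand.grid(
  Replicate = 1:2,
  Batch     = c("1", "2"),
  Time      = c("12", "24", "48"),
  Condition = c("control", "treatment"),
  stringsAsFactors = FALSE
)
coldata$TimeCondition <- factor(paste(coldata$Time, coldata$Condition, sep = "_"))

# One contrast per exposure time: c(factor name, numerator level, denominator level)
times <- c("12", "24", "48")
contrasts_by_time <- lapply(times, function(tp) {
  c("TimeCondition",
    paste(tp, "treatment", sep = "_"),
    paste(tp, "control", sep = "_"))
})
names(contrasts_by_time) <- paste0(times, "h")

# Check that every requested level really exists in the grouping variable
levels_ok <- vapply(
  contrasts_by_time,
  function(x) all(x[2:3] %in% levels(coldata$TimeCondition)),
  logical(1)
)
stopifnot(all(levels_ok))

print(contrasts_by_time)

# With a fitted object, each element would be used as, for example:
# results(dds, contrast = contrasts_by_time[["48h"]])
```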
For the second objective, I am considering the following design, although I am unsure how to extract the results. I believe this design, together with the contrast argument of results(), might allow me to extract the relevant information for both objectives:
design <- ~ Batch + Time + Condition + Time:Condition
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata, design = design)
With this design, how can I appropriately extract and interpret the relevant information? I think the code below gives me the results for treatment vs control at 48 hours, because I get the same results as with ~ Batch + TimeCondition:
mod_mat <- model.matrix(design(dds), colData(dds))
treatment_48 <- colMeans(mod_mat[dds$Condition == "treatment" & dds$Time == "48", ])
control_48 <- colMeans(mod_mat[dds$Condition == "control" & dds$Time == "48", ])
results(dds, contrast = treatment_48 - control_48)
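To show what this numeric contrast contains, here is a small base R sketch on a made-up balanced sample table (the layout, the level names and the reference levels "12" and "control" are my assumptions, not my real metadata). If I read it correctly, the weight on the batch column is zero because both groups have the same batch composition, so the batch coefficient cancels in the difference while still being part of the fitted model.

```r
# Hypothetical balanced layout: 2 conditions x 3 times x 2 batches x 2 replicates
coldata <- expand.grid(
  Replicate = 1:2,
  Batch     = c("1", "2"),
  Time      = c("12", "24", "48"),
  Condition = c("control", "treatment"),
  stringsAsFactors = FALSE
)
coldata$Batch     <- factor(coldata$Batch)
coldata$Time      <- factor(coldata$Time, levels = c("12", "24", "48"))
coldata$Condition <- factor(coldata$Condition, levels = c("control", "treatment"))

# Same design formula as above, built directly from the sample table
mod_mat <- model.matrix(~ Batch + Time + Condition + Time:Condition, data = coldata)

# Average model-matrix row for each of the two groups being compared
treatment_48 <- colMeans(mod_mat[coldata$Condition == "treatment" & coldata$Time == "48", ])
control_48   <- colMeans(mod_mat[coldata$Condition == "control"   & coldata$Time == "48", ])

# Weight given to each model coefficient in the comparison
contrast_48 <- treatment_48 - control_48
print(contrast_48)
```

In this toy layout the only non-zero weights fall on Conditiontreatment and Time48:Conditiontreatment, which would match the idea that the 48h effect is the main condition effect plus the 48h interaction term.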
Has the batch effect been accounted for in the results of this comparison? And how would I get the result for treatment at 48h vs 24h while taking into account the separate controls for each time?
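For the 48h vs 24h question, this is a sketch of the two contrasts I can think of, again in base R on a made-up balanced sample table (layout, level names and the helper group_mean() are hypothetical). The first compares the treated groups directly; the second subtracts each time point's own control first, which is my guess at how the separate controls could be taken into account.

```r
# Hypothetical balanced layout: 2 conditions x 3 times x 2 batches x 2 replicates
coldata <- expand.grid(
  Replicate = 1:2,
  Batch     = c("1", "2"),
  Time      = c("12", "24", "48"),
  Condition = c("control", "treatment"),
  stringsAsFactors = FALSE
)
coldata$Batch     <- factor(coldata$Batch)
coldata$Time      <- factor(coldata$Time, levels = c("12", "24", "48"))
coldata$Condition <- factor(coldata$Condition, levels = c("control", "treatment"))

mod_mat <- model.matrix(~ Batch + Time + Condition + Time:Condition, data = coldata)

# Hypothetical helper: average model-matrix row for one Condition/Time group
group_mean <- function(condition, time) {
  keep <- coldata$Condition == condition & coldata$Time == time
  colMeans(mod_mat[keep, , drop = FALSE])
}

# Direct comparison of treated samples: 48h vs 24h (ignores the controls)
treated_48_vs_24 <- group_mean("treatment", "48") - group_mean("treatment", "24")

# Control-adjusted comparison: (treatment - control) at 48h minus the same at 24h
adjusted_48_vs_24 <-
  (group_mean("treatment", "48") - group_mean("control", "48")) -
  (group_mean("treatment", "24") - group_mean("control", "24"))

print(treated_48_vs_24)
print(adjusted_48_vs_24)

# Either vector would then be passed as results(dds, contrast = ...)
```

In this toy layout the control-adjusted vector only has weights on the two interaction columns (+1 for 48h, -1 for 24h), whereas the direct comparison also carries the Time48 and Time24 main effects.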
I would greatly appreciate any guidance or suggestions on this matter.
Thank you.