I have an experiment with 2 different groups and several different time points (for simplicity, let's say it is 2 time points), and I'm trying to find the best way to model both group and time effects. Note that time is a categorical variable.
# in
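library(DESeq2)

# Simulated dataset: 100 genes, 32 samples
dds <- makeExampleDESeqDataSet(n = 100, m = 32)
# 2 time points x 2 groups, 8 samples per group/time cell
dds$time <- factor(rep(rep(c("T1", "T2"), each = 8), 2))
dds$group <- factor(rep(rep(rep(c("Ct", "Tr"), each = 4), 2), 2))
# Main effects plus the time:group interaction
design(dds) <- ~ time + group + time:group
dds <- DESeq(dds)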
colData(dds)
# out
DataFrame with 32 rows and 4 columns
condition group sizeFactor replaceable
<factor> <factor> <numeric> <logical>
sample1 A Ct 1.080654 TRUE
sample2 A Ct 1.009824 TRUE
sample3 A Ct 1.035633 TRUE
sample4 A Ct 0.953337 TRUE
sample5 A Tr 1.060243 TRUE
... ... ... ... ...
sample28 B Ct 1.22126 TRUE
sample29 B Tr 1.04888 TRUE
sample30 B Tr 1.09366 TRUE
sample31 B Tr 1.11761 TRUE
sample32 B Tr 1.23430 TRUE
We can also see the coefficients that were fitted:
# in
resultsNames(dds)
# out
[1] "Intercept" "group_Tr_vs_Ct" "time_T2_vs_T1" "groupTr.timeT2"
Fitting this design means we are modeling
(Equation 1) y = alpha + beta1 × time + beta2 × group + beta3 × (group × time)
Or, using the coefficient names above:
(Equation 2) y = intercept + time_T2_vs_T1 × time + group_Tr_vs_Ct × group + groupTr.timeT2 × (group × time)
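The 0/1 coding behind Equation 2 can be inspected with base R's `model.matrix()`, without fitting anything. This is only a sketch under assumptions: the term order `~ group + time + group:time` is chosen here so that the columns line up with the `resultsNames(dds)` output, and base R labels the columns `groupTr`, `timeT2` and `groupTr:timeT2`, which DESeq2 reports as `group_Tr_vs_Ct`, `time_T2_vs_T1` and `groupTr.timeT2`.

```r
# Same sample layout as in the post: 32 samples, 8 per group/time cell
time  <- factor(rep(rep(c("T1", "T2"), each = 8), 2))
group <- factor(rep(rep(rep(c("Ct", "Tr"), each = 4), 2), 2))

# Assumed term order (group first), to match the resultsNames() output
X <- model.matrix(~ group + time + group:time)

# Only four distinct rows exist, one per group/time cell
design_rows <- unique(X)
print(design_rows)

# Replicates per cell
print(table(group, time))
```

The first sample (Ct at T1) has a 1 only in the intercept column, which is what "encoded as 0" means for the reference levels.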
Given the reference levels are `Ct` and `T1` (which I am assuming will be encoded as `0` by the model), I'd like to confirm if my interpretation of the following effects makes sense:
- Effect of Tr group (for T1): `group_Tr_vs_Ct`
- Effect of T2 time (for Ct): `time_T2_vs_T1`
- Effect of Tr group (for T2): `group_Tr_vs_Ct + groupTr.timeT2`
- Effect of T2 time (for Ct and Tr), i.e. regardless of group or (Tr.T2 - Tr.T1) - (Ct.T2 - Ct.T1): `(group_Tr_vs_Ct + groupTr.timeT2 - group_Tr_vs_Ct) - (time_T2_vs_T1 - 0) = groupTr.timeT2 - time_T2_vs_T1`
- Effect of Tr group (for T1 + T2), i.e. regardless of time or (Tr.T2 - Ct.T2) - (Tr.T1 - Ct.T1): `(group_Tr_vs_Ct + groupTr.timeT2 - time_T2_vs_T1) - (group_Tr_vs_Ct - 0) = groupTr.timeT2 - time_T2_vs_T1`
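One way to check expressions like these by hand is to write each group/time cell as a row of the design matrix and subtract rows, since each cell mean is that row times the coefficient vector. The sketch below uses base R only and rests on assumptions: the one-row-per-cell data frame `cells` is hypothetical, and the term order is chosen to match the `resultsNames(dds)` output. Under these assumptions both difference-of-differences expressions come out as the interaction coefficient alone, which differs from the last two items in the list, and a group effect averaged over both time points would instead be `group_Tr_vs_Ct` plus half of `groupTr.timeT2`.

```r
# Hypothetical layout: one row per group/time cell (illustration only)
cells <- expand.grid(time = c("T1", "T2"), group = c("Ct", "Tr"))
rownames(cells) <- paste(cells$group, cells$time, sep = ".")

# Assumed design; columns: (Intercept), groupTr, timeT2, groupTr:timeT2
X <- model.matrix(~ group + time + group:time, data = cells)

# Differences of rows give the weight on each coefficient
tr_vs_ct_at_t1 <- X["Tr.T1", ] - X["Ct.T1", ]
tr_vs_ct_at_t2 <- X["Tr.T2", ] - X["Ct.T2", ]
t2_vs_t1_in_ct <- X["Ct.T2", ] - X["Ct.T1", ]
t2_vs_t1_in_tr <- X["Tr.T2", ] - X["Tr.T1", ]

# (Tr.T2 - Tr.T1) - (Ct.T2 - Ct.T1), identical to (Tr.T2 - Ct.T2) - (Tr.T1 - Ct.T1)
diff_of_diffs <- t2_vs_t1_in_tr - t2_vs_t1_in_ct

# Group effect averaged over T1 and T2 (one reading of "regardless of time")
tr_vs_ct_avg <- (tr_vs_ct_at_t1 + tr_vs_ct_at_t2) / 2

print(rbind(tr_vs_ct_at_t1, tr_vs_ct_at_t2, t2_vs_t1_in_ct,
            t2_vs_t1_in_tr, diff_of_diffs, tr_vs_ct_avg))
```

The intercept column is 0 in every row of the printed table, because the intercept is present in every cell and cancels in any difference.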
I've been checking a few posts with similar questions that have been quite helpful (example, example and especially here), but I'm uncertain of the interpretation of more complex comparisons. So my questions are:
- Does the interpretation of the effects above make sense?
- How can I retrieve the final contrast above?
- Why is it that `intercept` is not considered for any of the above, in the documentation and posts? Given Equation 1, one would think the intercept would be added to each of the terms above.
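On the intercept question, a short worked example may help. It assumes the cell means implied by Equation 2 with Ct and T1 coded as 0, and writes $\mu$ for a cell mean on the model scale (a symbol introduced here for illustration). Each effect in the list is a difference between two cell means, and the intercept appears in both, so it drops out:

```latex
\begin{aligned}
\mu_{\text{Ct.T1}} &= \text{intercept} \\
\mu_{\text{Tr.T1}} &= \text{intercept} + \text{group\_Tr\_vs\_Ct} \\
\mu_{\text{Tr.T1}} - \mu_{\text{Ct.T1}} &= \text{group\_Tr\_vs\_Ct}
\end{aligned}
```

The intercept would only remain if one asked for the level of a single cell rather than a comparison between cells.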
Thanks for your comment @swbarnes2, I agree with you that it becomes quite complex. My question is how to specify the contrasts for multiple columns in that situation. For the last example, and considering your approach, I would get coefficients for each of the four levels `Ct.T1`, `Ct.T2`, `Tr.T1` and `Tr.T2`. I see it is straightforward to contrast a pair of these (e.g. `c("group", "Tr.T2", "Tr.T1")`), but how to find a condition contrast regardless of time in this case (i.e. `(Tr.T2 - Ct.T2) - (Tr.T1 - Ct.T1)`)?

edit: would such a contrast be given as a vector of numbers based on the order of `resultsNames()`? For instance, I could retrieve the appropriate contrasts as follows
Would this make sense?
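To illustrate what a numeric contrast for the combined-factor approach could look like, here is a minimal base R sketch. It is not the code from the edit above. The factor name `cell`, the no-intercept design `~ 0 + cell` and the cell means in `mu` are all assumptions made up for illustration. The idea is that the weights follow the column order of the design matrix; in DESeq2 the numeric form of `results(dds, contrast = ...)` likewise expects one element per entry of `resultsNames(dds)`, in that order.

```r
# Hypothetical combined factor with one level per group/time cell
cell <- factor(c("Ct.T1", "Ct.T2", "Tr.T1", "Tr.T2"))

# No-intercept design: one coefficient per cell, in factor-level order
X <- model.matrix(~ 0 + cell)
print(colnames(X))

# Weights for (Tr.T2 - Ct.T2) - (Tr.T1 - Ct.T1), given in that column order
w <- c(cellCt.T1 = 1, cellCt.T2 = -1, cellTr.T1 = -1, cellTr.T2 = 1)
stopifnot(identical(names(w), colnames(X)))

# Sanity check on made-up cell means (not real data)
mu <- c(Ct.T1 = 5, Ct.T2 = 6, Tr.T1 = 7, Tr.T2 = 10)
contrast_value <- sum(w * mu)   # (10 - 6) - (7 - 5)
print(contrast_value)
```

Checking the names of the weight vector against the coefficient names before use guards against a silent mismatch in ordering.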