Hi, I’m trying to correctly identify some DEGs using DESeq2 in a mini-time series (2 timepoints, early and late infection). Simply grouping everything into one overall “treatment” variable works fine for some things. But in other cases, I want to better account for DEGs only due to the samples being older, and not related to disease progression. Having a little trouble with that.
Key sample characteristics:
levels(sample_scheme$treatment)
[1] "control_10" "control_20" "A_10" "A_20" "B_10" "B_20" "C_10" "C_20"
levels(sample_scheme $infection)
[1] "control" "A" "B" "C"
levels(sample_scheme $time_point)
[1] "10" "20" # these are loaded in as factors
What I can do - compare gene expression between infections at the same time, e.g., from A_10
to B_10
:
dsTxi_trt <- DESeqDataSetFromTximport(txi, colData = sample_scheme, design = ~treatment)
dds_trt<-DESeq(dsTxi_trt)
results(dds_trt, contrast = c("treatment", "B_10", "A_10"), alpha=0.05, pAdjustMethod="BH", lfcThreshold = 0.585, altHypothesis = "greaterAbs") # 1.5x fold change hypothesis
This works fine as far as I can tell.
A lot changes just due to sample time however:
...contrast = c("treatment", "control_20", "control_10")
... gives many thousand DEGs
So, presumably there’d be a strong signal of this uninteresting change wrapped up in a comparison of interest like early to late infection e.g., A_10 to A_20.
If I want to ask, how does expression change from A_10 to A_20, while accounting for the many expected changes due to time regardless of which infection treatment was given (so essentially, looking for the infection:time_point
interaction) – I tried this:
dsTxi_fullmod <- DESeqDataSetFromTximport(txi, colData = full_mod_scheme, design = ~infection + time_point + infection:time_point)
dds_fullmod<-DESeq(dsTxi_fullmod)
dds_finalmod<-DESeq(dsTxi_fullmod, test="LRT", reduced=~infection + time_point)
results(dds_finalmod, contrast=list(c("time_point_20_vs_10", "infectionA.time_point20")), test="Wald", alpha=0.05, pAdjustMethod="BH", lfcThreshold = 0.585, altHypothesis = "greaterAbs")
But this gives basically the same number of DEGs (when checked with summary(result)
) as the simple contrast = c("treatment", "A_20", "A_10")
using the group variable. There is a difference of only 2 DEGs and 1 outlier between methods – I don’t think I was able to properly control for “background” effects of time (samples getting older).
[design = ~time_point * infection
I gather would let me compare A to B ‘generally’, while accounting for the different sampling times. This is not what I want to do, I’m most interested in A to B at the same time, and A to A at different times.]
Any suggestions to achieve what I’m after here?
Thank you!