Hello,
I'm trying to use DESeq2 to find differences in microbiota composition in a 2x2 crossover study. I am not sure what model to use to detect differences between P (placebo) and T (treatment). Each subject has only two time points (one with P and one with T). Below is my colData.
Subject Treatment Phase Sequence Combined1 Combined2
S1 T One BA T_One T_BA
S1 P Two BA P_Two P_BA
S2 P One AB P_One P_AB
S2 T Two AB T_Two T_AB
S3 T One BA T_One T_BA
S3 P Two BA P_Two P_BA
S4 T One BA T_One T_BA
S4 P Two BA P_Two P_BA
S5 P One AB P_One P_AB
S5 T Two AB T_Two T_AB
S6 P One AB P_One P_AB
S6 T Two AB T_Two T_AB
S7 T One BA T_One T_BA
S7 P Two BA P_Two P_BA
S8 P One AB P_One P_AB
S8 T Two AB T_Two T_AB
I tried combining Treatment with Phase (Combined1) and Treatment with Sequence (Combined2) and then using the design below (swapping Combined1 for Combined2 leads to the same error message).
dds <- DESeqDataSetFromMatrix(countData = data,
                              colData = colData,
                              design = ~ Subject + Combined1)
Whenever I do this, I get the following error message:
Please read the vignette section 'Model matrix not full rank':
I would like to test for the bacteria that are significantly different between the two treatments (P vs T), but I also want to control for sequence due to the crossover design.
What design seems appropriate?
Thank you.
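For anyone who wants to see where the rank problem comes from, here is a minimal sketch in base R (no DESeq2 needed). It rebuilds the colData from the table above and compares the number of columns in the model matrix for ~ Subject + Combined1 with its rank. It assumes the error is raised because the model matrix has more columns than linearly independent columns; the helper names (seq_by_subject, mm, n_cols, mm_rank) are made up for illustration.

```r
# Rebuild the colData from the post using base R only (no DESeq2 needed here).
subjects <- paste0("S", 1:8)
# Sequence per subject, copied from the table: S1, S3, S4, S7 are BA; the rest AB.
seq_by_subject <- c("BA", "AB", "BA", "BA", "AB", "AB", "BA", "AB")

colData <- data.frame(
  Subject  = factor(rep(subjects, each = 2), levels = subjects),
  Phase    = factor(rep(c("One", "Two"), times = 8)),
  Sequence = factor(rep(seq_by_subject, each = 2))
)
# In the table, BA subjects get T in phase One; AB subjects get T in phase Two.
colData$Treatment <- factor(
  ifelse((colData$Sequence == "BA") == (colData$Phase == "One"), "T", "P")
)
colData$Combined1 <- factor(paste(colData$Treatment, colData$Phase, sep = "_"))

mm <- model.matrix(~ Subject + Combined1, data = colData)
n_cols  <- ncol(mm)      # number of coefficients the design asks for
mm_rank <- qr(mm)$rank   # number of linearly independent columns

# Every subject sits in exactly one sequence, and each Combined1 level
# belongs to exactly one sequence, so part of Combined1 repeats Subject.
seq_per_subject <- table(colData$Subject, colData$Sequence)
seq_per_level   <- table(colData$Combined1, colData$Sequence)

cat("columns:", n_cols, " rank:", mm_rank, "\n")
```

The design asks for 11 coefficients but the matrix only has rank 10. The two cross-tabulations show why: Sequence is fixed within each subject, and each Combined1 level belongs to a single sequence, so the Sequence information carried by Combined1 is already carried by Subject.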
Is subject S2 the only subject that received sequence AB? If so, I think that might be the issue, since sequence AB would be confounded with subject S2.
No, S2 is not the only one! I updated the table; I originally included only half of it. Thanks.
That helps. I'm not a statistician, but the current design of
~Subject + Combined1
doesn't seem to me like a suitable way to test for a treatment effect. I googled "DESeq2 2x2 cross over design" and found some discussions on Bioconductor support and on Biostars, for example https://support.bioconductor.org/p/p132767/ and https://www.biostars.org/p/482861/ but I think your study design or question differs a bit from these other discussions.
It seems to me that testing for T versus P while controlling for subject and sequence variation would use
~ Treatment + Subject + Sequence
as an initial design.
Each subject was measured under only two conditions. I don't think you can include subject in the design.
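One way to sanity-check candidate designs before handing them to DESeq2 is to compare the column count and rank of each model matrix. The sketch below does that in base R for the design suggested above and for two alternatives that drop Sequence. The alternatives (~ Subject + Treatment and ~ Subject + Phase + Treatment) are illustrative assumptions, not designs proposed in this thread, and the names designs and rank_report are hypothetical. The check only tells you whether a design is estimable, not whether it is the right model for the study.

```r
# Rebuild the colData from the table in the question (base R only).
subjects <- paste0("S", 1:8)
seq_by_subject <- c("BA", "AB", "BA", "BA", "AB", "AB", "BA", "AB")

colData <- data.frame(
  Subject  = factor(rep(subjects, each = 2), levels = subjects),
  Phase    = factor(rep(c("One", "Two"), times = 8)),
  Sequence = factor(rep(seq_by_subject, each = 2))
)
# BA subjects get T in phase One; AB subjects get T in phase Two.
colData$Treatment <- factor(
  ifelse((colData$Sequence == "BA") == (colData$Phase == "One"), "T", "P")
)

# Candidate designs; the last two are assumptions added for comparison.
designs <- list(
  suggested         = ~ Treatment + Subject + Sequence,
  paired            = ~ Subject + Treatment,
  paired_with_phase = ~ Subject + Phase + Treatment
)

# For each design, count the model-matrix columns and the matrix rank.
rank_report <- t(sapply(designs, function(f) {
  mm <- model.matrix(f, data = colData)
  c(columns = ncol(mm), rank = qr(mm)$rank)
}))
# A design is full rank when its rank equals its number of columns.
full_rank <- rank_report[, "rank"] == rank_report[, "columns"]

print(cbind(rank_report, full_rank))
```

With this colData, the design containing both Subject and Sequence comes out one short of full rank (10 columns, rank 9), because each subject belongs to exactly one sequence. The two designs without Sequence are full rank (9 of 9 and 10 of 10), so Subject can sit in the design alongside Treatment here even though each subject has only two samples.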