I want to perform a differential expression analysis with DESeq2. I have 2 conditions (input (whole cell) and a cell fraction) and 2 treatments (treated and wild type), each in replicates (2 here, for simplicity).
I'm interested in the differential expression in the fraction following treatment. I think my design should be something like (Treated_fraction / Treated_input) / (WT_fraction / WT_input).
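To make that ratio of ratios concrete, here is a small base R sketch using the raw counts of ENSG00000227232 from the countData shown in this post. It is only an illustration: the counts are averaged per group without any library-size normalization (which DESeq2 would apply), and the variable names are made up for the example.

```r
# Raw counts for ENSG00000227232, copied from the countData in this post.
# Assumption: plain per-group means, no normalization for sequencing depth.
group_means <- c(Treated_fraction = mean(c(29, 22)),
                 Treated_input    = mean(c(31, 47)),
                 WT_fraction      = mean(c(24, 12)),
                 WT_input         = mean(c(23, 13)))

# Ratio of ratios, as written in the question
ratio_of_ratios <- unname(
  (group_means["Treated_fraction"] / group_means["Treated_input"]) /
  (group_means["WT_fraction"] / group_means["WT_input"])
)

# On the log2 scale the same quantity is a difference of differences,
# which is what an interaction term in a log-linear model estimates
log_means <- log2(group_means)
diff_of_diffs <- unname(
  (log_means["Treated_fraction"] - log_means["Treated_input"]) -
  (log_means["WT_fraction"] - log_means["WT_input"])
)

print(ratio_of_ratios)
print(isTRUE(all.equal(log2(ratio_of_ratios), diff_of_diffs)))
```

So asking for (Treated_fraction / Treated_input) / (WT_fraction / WT_input) amounts to asking for an interaction effect between the fraction/input factor and the treatment factor.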
This is my countData
> head(countData)
           Geneid Length Treated_fraction_1 Treated_fraction_2 Treated_input_1 Treated_input_2 WT_fraction_1 WT_fraction_2 WT_input_1 WT_input_2
1 ENSG00000223972   1756                  0                  0               0               0             0             0          0          0
2 ENSG00000227232   2073                 29                 22              31              47            24            12         23         13
3 ENSG00000243485   1021                  0                  0               0               0             0             1          0          0
4 ENSG00000237613   1219                  0                  0               0               0             0             0          0          0
5 ENSG00000268020    947                  0                  0               0               0             0             0          0          0
6 ENSG00000240361    940                  0                  0               0               0             0             0          0          0
This is my colData
> colData
                             assays conditions replicates
Treated_fraction_1 Treated_fraction    Treated          1
Treated_fraction_2 Treated_fraction    Treated          2
Treated_input_1       Treated_input    Treated          1
Treated_input_2       Treated_input    Treated          2
WT_fraction_1           WT_fraction         WT          1
WT_fraction_2           WT_fraction         WT          2
WT_input_1               WT_input_1         WT          1
WT_input_2               WT_input_1         WT          2
So far my command is
dds <- DESeqDataSetFromMatrix(countData = subset(countData, select = -Length),
                              colData = colData,
                              design = ~ assays + conditions + assays:conditions,
                              tidy = TRUE)
but this gives me the following error
Error in checkFullRank(modelMatrix)
The error appears to derive from the replicates column.
What would be the correct colData and design to use in this case?
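One way to look into this without DESeq2 is to build the model matrix in base R and compare its rank with its number of columns. The sketch below uses the assays and conditions values from the colData above; the second data frame, with separate `assay` and `condition` columns, is a hypothetical alternative layout and not something taken from the post. Note that the replicates column is not part of the design formula, so it does not enter the model matrix at all.

```r
# Layout as posted: every level of "assays" already implies the treatment,
# so "conditions" is nested inside "assays"
nested <- data.frame(
  assays     = factor(rep(c("Treated_fraction", "Treated_input",
                            "WT_fraction", "WT_input_1"), each = 2)),
  conditions = factor(rep(c("Treated", "WT"), each = 4))
)
mm_nested <- model.matrix(~ assays + conditions + assays:conditions, nested)
# More columns than the rank means the matrix is not full rank
print(c(columns = ncol(mm_nested), rank = qr(mm_nested)$rank))

# Hypothetical alternative: two crossed factors
# (assay = input/fraction, condition = WT/Treated)
crossed <- data.frame(
  assay     = factor(rep(c("fraction", "input", "fraction", "input"), each = 2),
                     levels = c("input", "fraction")),
  condition = factor(rep(c("Treated", "WT"), each = 4),
                     levels = c("WT", "Treated"))
)
mm_crossed <- model.matrix(~ assay + condition + assay:condition, crossed)
print(c(columns = ncol(mm_crossed), rank = qr(mm_crossed)$rank))
print(colnames(mm_crossed))
```

With the crossed layout, the last column of the model matrix is the interaction term, which corresponds to the ratio of ratios described at the top of the post.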
Following this, I usually do
deseq.results <- results(dds, contrast=c("conditions", A, B))
What would be the correct results command for this analysis?
This is a simplified version with 2 conditions. Can it be generalized to more conditions (i.e. input and multiple fractions)?
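As a rough sketch of what the generalization could look like, the base R snippet below builds a model matrix for a whole-cell input plus two fractions. The level names `fractionA` and `fractionB` and the column names `assay` and `condition` are assumptions made for the example. With input as the reference level, each fraction gets its own interaction column with the treatment.

```r
# Hypothetical layout: input plus two fractions, two treatments, two replicates
multi <- expand.grid(
  replicate = 1:2,
  assay     = c("input", "fractionA", "fractionB"),
  condition = c("WT", "Treated")
)
# Set the reference levels explicitly: input and WT
multi$assay     <- factor(multi$assay, levels = c("input", "fractionA", "fractionB"))
multi$condition <- factor(multi$condition, levels = c("WT", "Treated"))

mm_multi <- model.matrix(~ assay + condition + assay:condition, multi)
print(colnames(mm_multi))
print(c(columns = ncol(mm_multi), rank = qr(mm_multi)$rank))
```

Each interaction column compares one fraction against input between Treated and WT, so there is one "ratio of ratios" per fraction.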
Thanks!
Hi, I have the exact same problem, so I would be interested in any updates on this. We have a fraction and the whole cell as assays, and treated and control as conditions. So in colData I have one column "assay" with fraction and cell, and another column "condition" with treatment and control. I was told that the design would be (fraction_treat/cell_treat) / (fraction_ctl/cell_ctl), but I used the same formula as you. So I think it is a design like yours (i.e. (Treated_fraction / Treated_input) / (WT_fraction / WT_input)).
I followed this post from Michael Love and used this command:
I don't know if this is a good way to do it. If I understood correctly, here I use the likelihood ratio test (LRT), which compares 2 models: a full model that takes all the possible effects into account, and a "reduced" model (the one here) that leaves out the interaction term and thus allows testing for this effect. If you think I misunderstood, feel free to tell me.
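For reference, here is a sketch of what an LRT for the interaction could look like in DESeq2. It is not the command referred to above, only an illustration under stated assumptions: the counts are simulated with `makeExampleDESeqDataSet()`, and the colData columns are assumed to be named `assay` and `condition`, with input and WT as reference levels.

```r
library(DESeq2)

set.seed(1)
# Assumed colData layout: two crossed factors, reference levels input and WT
coldata <- data.frame(
  assay     = factor(rep(c("fraction", "input", "fraction", "input"), each = 2),
                     levels = c("input", "fraction")),
  condition = factor(rep(c("Treated", "WT"), each = 4),
                     levels = c("WT", "Treated")),
  row.names = c("Treated_fraction_1", "Treated_fraction_2",
                "Treated_input_1", "Treated_input_2",
                "WT_fraction_1", "WT_fraction_2",
                "WT_input_1", "WT_input_2")
)

# Simulated counts, for illustration only (no real effects are built in)
cts <- counts(makeExampleDESeqDataSet(n = 500, m = 8))
colnames(cts) <- rownames(coldata)

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                              design = ~ assay + condition + assay:condition)

# Full model vs. reduced model without the interaction term
dds <- DESeq(dds, test = "LRT", reduced = ~ assay + condition)

# The interaction coefficient is the only result name containing a dot
interaction_name <- grep("\\.", resultsNames(dds), value = TRUE)
res_lrt <- results(dds, name = interaction_name)
print(interaction_name)
print(head(res_lrt))
```

In this sketch the p-values come from the LRT, while the log2 fold change column reports the interaction coefficient, i.e. the log2 of the ratio of ratios.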
Anyway, if you have any news about this, I would be happy to hear it.