Hello,
I have some questions after reading the DESeq2 vignette and several threads on this topic. Suppose I have this data:
If my objective is to compare the `Status` of the samples, in principle the formula would be `~Status`. However, there are big differences between the levels of `Zone`, as we can see in this PCA:
Would we have to write the formula as `~Zone+Status` in order to control for the effect of `Zone` on `Status`? Or is it better to use only `~Status`? If a new variable such as `Sex` is added and we want to control for its effect, would we have to change the formula to `~Zone+Sex+Status`? Where can I find an explanation of what this "control" is based on?
Thank you
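To make the idea of "controlling" for `Zone` concrete, here is a minimal sketch in base R on simulated log-scale expression for a single gene. It is not DESeq2 itself (DESeq2 fits a negative binomial GLM on counts), but the design formula plays the same role: each term gets its own coefficient, so the `Status` coefficient is estimated after accounting for `Zone`. The sample layout, level names, and effect sizes below are made up for illustration.

```r
# Hypothetical, unbalanced layout: Status and Zone are partly confounded.
zone   <- factor(rep(c("Z1", "Z2"), each = 8))
status <- factor(c(rep("ctrl", 6), rep("case", 2),   # Z1: mostly ctrl
                   rep("ctrl", 2), rep("case", 6)),  # Z2: mostly case
                 levels = c("ctrl", "case"))

# Simulated log2 expression: large Zone effect (3), small Status effect (0.5).
set.seed(1)
log_expr <- 5 + 3 * (zone == "Z2") + 0.5 * (status == "case") +
  rnorm(length(zone), sd = 0.2)

fit_status_only <- lm(log_expr ~ status)         # analogous to ~Status
fit_adjusted    <- lm(log_expr ~ zone + status)  # analogous to ~Zone+Status

# Without Zone, part of the Zone difference is attributed to Status.
print(coef(fit_status_only)["statuscase"])
# With Zone in the model, the estimate is close to the simulated 0.5.
print(coef(fit_adjusted)["statuscase"])
```

In this toy setup the `~Status`-only estimate is inflated because most `case` samples sit in the zone with higher expression; adding `Zone` to the formula separates the two effects.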
I would base the decision on how much the `Zones` are affecting your `Status`. Did you try fitting `Status ~ Zones` and checking the results? If you see that your `Zones` are really influencing your `Status`, then use `~ Status + Zones`. The same applies to a new variable such as `Sex`, as you mentioned.
Thanks. In the DESeq2 vignette we can see `design = ~ batch + condition`. Shouldn't it be in this order? `design = ~ Zones + Status`
Best
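On the ordering question: as far as I know, the fitted model is the same whichever order the terms are written in. What changes is that `results()` called with no arguments reports the last variable in the design, which is why the vignette puts the variable of interest last. Below is a sketch using DESeq2's simulated example data; the `Status` and `Zone` columns and their levels are made-up stand-ins for the real sample table.

```r
library(DESeq2)

# Simulated counts from DESeq2's built-in helper (12 samples, 500 genes).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Hypothetical sample annotations; every Zone contains both Status levels.
dds$Status <- factor(rep(c("ctrl", "case"), each = 6), levels = c("ctrl", "case"))
dds$Zone   <- factor(rep(c("Z1", "Z2", "Z3"), times = 4))

# Same terms, two orders.
dds_last  <- dds; design(dds_last)  <- ~ Zone + Status
dds_first <- dds; design(dds_first) <- ~ Status + Zone
dds_last  <- DESeq(dds_last)
dds_first <- DESeq(dds_first)

# Default results(): last variable in the design, last level vs reference.
res_last <- results(dds_last)    # Status: case vs ctrl, adjusted for Zone
res_zone <- results(dds_first)   # Zone: Z3 vs Z1, not the Status comparison

# Naming the contrast explicitly makes the term order irrelevant.
res_explicit <- results(dds_first, contrast = c("Status", "case", "ctrl"))
print(resultsNames(dds_first))
```

So `~ Zone + Status` is the convenient order when `Status` is the comparison of interest, and an explicit `contrast` removes any dependence on the order.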