Entering edit mode
4.5 years ago
ATCG
▴
400
Hi, I am using DESEq2 to compute differential gene expression, and if I am interpreting this correctly, changing the names of categories in the colData reverses the test that is performed. The matrix counts is intact with 6 col and (WT)3 (hux)3. Here WT=wild type or control and hux=treatment. I want to perform the test to obtain LFC hux/control(WT). I encountered this issue yesterday, and I tried to articulate the problem the best I could, so I hope my question makes sense.
expdesign = read.delim("expdesign.txt", colClasses='factor')
WT_1 control
WT_2 control
WT_3 control
hux_1 hux
hux_2 hux
hux_3 hux
mcols(res, use.names = T)
log2FoldChange results log2 fold change (MLE): condition hux vs control
L2FC is hux/control
expdesign = read.delim("expdesign.txt", colClasses='factor')
WT_1 WT
WT_2 WT
WT_3 WT
hux_1 hux
hux_2 hux
hux_3 hux
mcols(res, use.names = T)
log2FoldChange results log2 fold change (MLE): condition WT vs hux
L2FC is WT/hux
I suspect this is a consequence of turning a column into a factor. The factor levels are by default set alphabetically. You can check if this is the cause by changing the factor levels and seeing if that changes the constrast.
Actually this will yield the same results: expdesign = read.delim("expdesign.txt")
Also, DESeqDataSetFromMatrix will convert to factor with a warning which is why I am reading them as factors.
If DESeqDataSetFromMatrix is coercing it to a factor, the factor levels will also be set alphabetically. After you import the first data.frame in your example, try both
expdesign[[2]] <- factor(expdesign[[2]], levels=c("control", "hux"))
as well asexpdesign[[2]] <- factor(expdesign[[2]], levels=c("hux", "control"))
. Of course you can just ignore this too and set the contrast manually when returning the results.Yes, this works thank you so much for your help!
How would you " ignore this too and set the contrast manually when returning the results?"
If you give your columns names, such as
colnames(expdesign) <- c("sample", "condition")
, you can specify what comparison you want to make at the end. For example, you could doresults(dds, contrast=c("condition", "control", "hux"))
orresults(dds, contrast=c("condition", "hux", "control"))
.Thank you, rpolicastro. I also found the answer to this on pg 14 of Beginner's guide to using the DESeq2 package. http://www.bioconductor.org/packages/2.14/bioc/vignettes/DESeq2/inst/doc/beginner.pdf
Do the following "to make sure that Control is the rst level in the treatment factor, so that the default log2 fold changes are calculated as treatment over control and not the other way around".