DESeq2 doesn't compare the right ctrl vs treatment?
1
1
Entering edit mode
10.3 years ago
madkitty ▴ 690

We have a very simple experiment, 1 control (C) and 3 independent treatments (T, B, A). So technically I wrote the following script thinking it would compare C to T, then C to B and finally C to A.

Though at the bottom of my pipeline, it seems that DESeq2 is comparing treatments T vs A. Is there a better way to write the condition so it would compare independentely control to each treatment ? (After that I should cluster genes and output a heatmap)

# DESeq1 libraries
library( "DESeq2" )
library("Biobase")
# Heatmap libraries
library( "genefilter" )
library(gplots)

r2 = read.table("matrix.txt", header=TRUE, row.names=1)
head(r2)
samples <- data.frame(row.names=c("C1", "T1", "B1", "A1"), condition=as.factor(c("C1", "T1", "B1", "A1")))
dds <- DESeqDataSetFromMatrix(countData = as.matrix(r2), colData=samples, design=~condition)

# Run DESeq2
dds <- DESeq(dds)

# DESeq2 Results
res <- results(dds)
head (res)

**********

> mcols(res, use.names=TRUE)
DataFrame with 6 rows and 2 columns
                       type                                  description
                <character>                                  <character>
baseMean       intermediate                  the base mean over all rows
log2FoldChange      results log2 fold change (MAP): condition T1 vs A1
lfcSE               results         standard error: condition T1 vs A1
stat                results         Wald statistic: condition T1 vs A1
pvalue              results      Wald test p-value: condition T1 vs A1
padj                results                         BH adjusted p-values
DESeq2 RNA-Seq • 14k views
ADD COMMENT
8
Entering edit mode
10.3 years ago

This has nothing to do with DESeq2. It can't read your mind that you want C1 to be the base level. You just need to samples$condition <- relevel(samples$condition, "C1") before making dds.

ADD COMMENT
0
Entering edit mode

Oh Thanks, I see where my mistake was. I re-wrote the script, can you confirm this is the proper way to use it?

samples <- data.frame(row.names=c("C1", "T1", "B1", "A1"), condition=as.factor(c("C1", "T1", "B1", "A1")))
samples$condition <- relevel(samples$condition, "C1")
dds <- DESeqDataSetFromMatrix(countData = as.matrix(r2), colData=samples$condition, design=~condition)
ADD REPLY
0
Entering edit mode

That should be correct.

When you make a factor, the levels are ordered lexicographically and most programs will just assume that the first level in the list of levels (i.e., from levels(samples$condition), not the first value of samples$condition) is the base level for comparisons. To do otherwise would open a very large bag of worms (e.g., if you assume that some factors starting with C are control, then you'd piss off people working on cancer:normal comparisons).

ADD REPLY
0
Entering edit mode

Unfortunately, it returns an error message.

I have a first column with gene names (before C1), which should be row.names. Am I using data.frame properly? Should I adapt row.names?

> samples <- data.frame(row.names=c("C1", "T1", "B1", "A1"), condition=as.factor(c("C1", "T1", "B1", "A1")))
> samples$condition <- relevel(samples$condition, "C1")

> dds <- DESeqDataSetFromMatrix(countData = as.matrix(r2), colData=samples$condition, design=~condition)
Error in validObject(.Object) : 
  invalid class "SummarizedExperiment" object: invalid object for slot "colData" in class "SummarizedExperiment": got class "factor", should be or extend class "DataFrame"
ADD REPLY
0
Entering edit mode
dds <- DESeqDataSetFromMatrix(countData = as.matrix(r2), colData=samples, design=~condition)
ADD REPLY
0
Entering edit mode

Thanks! it seems to be working well now. Though when I run mcols, it says T1 vs C1. What happened to treatment A and B? Is it still comparing them separately versus the control C1?

> mcols(res, use.names=TRUE)
DataFrame with 6 rows and 2 columns
                       type                                     description
                <character>                                     <character>
baseMean       intermediate                     the base mean over all rows
log2FoldChange      results log2 fold change (MAP): condition T1 vs C1
lfcSE               results         standard error: condition T1 vs C1
stat                results         Wald statistic: condition T1 vs C1
pvalue              results      Wald test p-value: condition T1 vs C1
padj                results                            BH adjusted p-values
ADD REPLY
0
Entering edit mode

The results() function has a 'coef' option (or something like that) that you can use to extract the results for the other coefficients.

ADD REPLY

Login before adding your answer.

Traffic: 1986 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6