Question

DESeq2 multiple factor design clarification

9.4 years ago

Nicolas Rosewick 11k

Hi,

I have a general question about DESeq2 designs. Let's say I have an experiment with three factors A, B and C, each independent of the others. I want to assess differential expression based on factor A. Is the following design correct?

~C+B+A

Another question: is the design ~B+C+A the same as ~C+B+A?

Thanks

deseq2 design • 7.1k views

updated 22 months ago by Ram 44k • written 9.4 years ago by Nicolas Rosewick 11k

Ram · Answer 1 · 2015-06-03

9.4 years ago

Devon Ryan 104k

The order doesn't matter, ~C+B+A is the same as ~A+B+C or ~B+A+C and so on. This is actually generic to R, so the exact same designs apply to limma, edgeR, aov, lm() and so on.
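To see the order-independence in plain R, here is a minimal sketch using lm() on made-up data. The factor names A, B and C follow the question; the values and the response y are invented purely for illustration.

```r
# Toy data: factor names follow the question, all values are made up.
set.seed(1)
n <- 24
d <- data.frame(
  A = factor(rep(c("a1", "a2"), each = 12)),
  B = factor(rep(c("b1", "b2"), times = 12)),
  C = factor(rep(c("c1", "c2", "c3"), each = 2, times = 4))
)
d$y <- rnorm(n) + 2 * (d$A == "a2")

# Same additive model, terms written in two different orders
fit_cba <- lm(y ~ C + B + A, data = d)
fit_abc <- lm(y ~ A + B + C, data = d)

# Line the coefficients up by name before comparing them
coef_cba <- coef(fit_cba)
coef_abc <- coef(fit_abc)[names(coef_cba)]
same_coefs <- all.equal(coef_cba, coef_abc)
same_coefs
```

The two fits report their coefficients in a different order, but each named coefficient should have the same estimate in both.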

One caveat is that within DESeq2, plotting is done according to the last factor in the model, so ~A+B+C will yield different plots than ~C+B+A. That's why it's recommended to put the factor of highest interest last.

updated 5.0 years ago by Ram 44k • written 9.4 years ago by Devon Ryan 104k

re: plotting. After the first release, I realized that it's better to have the output of results() be the input to plotMA, so users have to specify the comparison to plot. plotMA(dds) is really just plotMA(results(dds)) and so the docs don't mention plotMA(dds) anymore
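A minimal sketch of that workflow, assuming a recent DESeq2 release. It uses the simulated data set from makeExampleDESeqDataSet(), which provides a two-level factor called condition; the batch factor is a hypothetical addition for illustration only.

```r
library(DESeq2)

# Simulated counts; `condition` (levels "A" and "B") comes with the example data.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 8)

# Hypothetical second factor, added only so the design has two terms
dds$batch <- factor(rep(c("b1", "b2"), times = 4))
design(dds) <- ~ batch + condition
dds <- DESeq(dds)

# Pick the comparison explicitly, then plot that results table
res <- results(dds, contrast = c("condition", "B", "A"))
plotMA(res)
```

Because the comparison is named in the results() call, the plot no longer depends on which factor happens to come last in the design.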

updated 5.0 years ago by Ram 44k • written 9.4 years ago by Michael Love ★ 2.6k

Time for me to read the most recent docs :)

written 9.4 years ago by Devon Ryan 104k

OK thanks. But let's say that factor A represents the cell type and factor B the presence of a specific mutation in a gene. I want to detect the effect of the presence of this specific mutation (let's drop factor C for now). So my design would be ~A+B. But if the order in the design is not important, and since the transcriptomes of different cell types are very different, DESeq will give me the genes that are differentially expressed based mostly on factor A (I don't expect an important effect due to the mutation defined in factor B). Is that correct?

updated 22 months ago by Ram 44k • written 9.4 years ago by Nicolas Rosewick 11k

If the effect of B is small, then yes, you should get mostly (perhaps only) DE genes due to A.

written 9.4 years ago by Devon Ryan 104k

So if I use the design ~B, it would only give me genes DE based on the mutation (factor B), but a problem arises from comparing different cell types. Not simple...

written 9.4 years ago by Nicolas Rosewick 11k

I suspect you're forgetting that you can use ~A+B and specify which coefficient you want the results for (see help(results)). So you can still use ~A+B and only get the results for B.

updated 5.0 years ago by Ram 44k • written 9.4 years ago by Devon Ryan 104k

Something like

> resultsNames(dds)
[1] "Intercept"    "AcellType1" "AcellType2"  "Bmut"    "BnoMut" 

> res <- results(dds,name="Bmut")

updated 5.0 years ago by Ram 44k • written 9.4 years ago by Nicolas Rosewick 11k

Yup, that'd give you the effect of Bmut while controlling for everything else.

written 9.4 years ago by Devon Ryan 104k

I tried both results(dds) and results(dds,name="Bmut") and they give me the same p-values (the fold changes are different, though). Maybe I'm missing something. My design is slightly different from the example above: I have a factor B with three conditions (WT, mut1, mut2). Factor A remains the cell type. So:

full <- ~A+B      # full model: cell type plus mutation status
reduced <- ~A     # reduced model: drops B, the factor being tested
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- nbinomLRT(dds,full=full,reduced=reduced)

updated 5.0 years ago by Ram 44k • written 9.4 years ago by Nicolas Rosewick 11k

I'm not sure how that would work in the context of an LRT. Why not use a Wald test?

updated 5.0 years ago by Ram 44k • written 9.4 years ago by Devon Ryan 104k

I thought if a factor has more than two conditions, it's better to use the LRT function. My bad.

written 9.4 years ago by Nicolas Rosewick 11k

Only if you don't care about the effect of each level. The LRT is asking, "Is there a difference due to B?" What I suspect you actually want to ask is, "Is there an effect of Bmut vs. BWT?", or whatever else the 3rd level of B is called. That sort of question is most easily addressed with a Wald test.
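For what it's worth, here is a rough sketch of the two approaches side by side, assuming a recent DESeq2 release. The counts are simulated with makeExampleDESeqDataSet(); the A and B annotations are hypothetical stand-ins for the cell type and the three-level mutation factor from this thread. Coefficient names such as "B_mut1_vs_WT" are what recent versions produce, so check resultsNames() on your own object. It also shows why results() returned identical p-values earlier: under the LRT the p-value comes from the full-versus-reduced comparison, and the name or contrast argument only changes which fold change is reported.

```r
library(DESeq2)

set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Hypothetical annotation mirroring the thread: 2 cell types x 3 genotypes
dds$A <- factor(rep(c("cellType1", "cellType2"), each = 6))
dds$B <- factor(rep(c("WT", "mut1", "mut2"), times = 4),
                levels = c("WT", "mut1", "mut2"))
design(dds) <- ~ A + B

# Wald test: one specific comparison per results() call
dds_wald <- DESeq(dds)
res_mut1 <- results(dds_wald, contrast = c("B", "mut1", "WT"))
res_mut2 <- results(dds_wald, contrast = c("B", "mut2", "WT"))

# LRT: a single test of whether B explains anything beyond A
dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ A)
res_lrt1 <- results(dds_lrt, name = "B_mut1_vs_WT")
res_lrt2 <- results(dds_lrt, name = "B_mut2_vs_WT")

# Under the LRT the p-values do not depend on the coefficient requested
lrt_same <- all.equal(res_lrt1$pvalue, res_lrt2$pvalue)
lrt_same
```

With the Wald fit, res_mut1 and res_mut2 answer two different questions and carry their own p-values. With the LRT fit, both tables share one p-value per gene and differ only in the log2 fold change column.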

updated 5.0 years ago by Ram 44k • written 9.4 years ago by Devon Ryan 104k

OK I understand a little bit better now. Thanks Devon.

updated 22 months ago by Ram 44k • written 9.4 years ago by Nicolas Rosewick 11k