Question

DESeq2 : three factor conditions

18

Entering edit mode

10.1 years ago

Nicolas Rosewick 11k

Hi,

I've a small question concerning differential expression analysis. So I've a 12 samples divided into three groups (A, B and C ). If a do that :

samples <- read.table("samples.txt",sep="\t",as.is=T)
colnames(samples)<-c("sampleName","fileName","condition")
dds1 <- DESeqDataSetFromHTSeqCount(sampleTable=samples,directory="./",design= ~ condition)
dds1 <- DESeq(dds1)
res <- results(dds1)

What will the p-value represent? If it's significatif, is it because the gene of interest is up or down regulated in one of the three groups?

To do pairwise comparison, I do that. Is it correct?

res.A.B <- results(dds1, contrast=c("condition","A","B"))
res.A.C <- results(dds1, contrast=c("condition","A","C"))
res.B.C <- results(dds1, contrast=c("condition","B","C"))

Thanks

mulitpleconditions DESeq2 • 28k views

ADD COMMENT • link updated 2.8 years ago by Ram 44k • written 10.1 years ago by Nicolas Rosewick 11k

Ram · Accepted Answer · 2014-10-16

14

Entering edit mode

10.1 years ago

Devon Ryan 104k

By default, DESeq2 uses a Wald test to compute the p-values, so it's likely that res just has the output of the A vs. C contrast (rather than using a likelihood ratio test to compare ~condition versus ~1). Regardless of the test, the direction of change and the p-value are separate, so you'll have significant findings in both directions.

Yes, the method you showed for the pairwise comparisons looks correct.

BTW, if you really do want to test whether something is changed due to condition in general (i.e., do something like a one-way ANOVA), you can just use the LRT methods. Keep in mind that the direction of change there can be questionable, since you're then asking if the full model fits better than the reduced rather than asking of parameter estimates (i.e., fold-changes) are different than 0 (or a different threshold, should you specify something else...though that's not common).

ADD COMMENT • link updated 3.6 years ago by Ram 44k • written 10.1 years ago by Devon Ryan 104k

3

Entering edit mode

Thanks Devon. For LRT is that ok?

ddsLRT <- DESeq(dds1, test="LRT", reduced= ~ 1)
resLRT <- results(ddsLRT)

ADD REPLY • link updated 2.8 years ago by Ram 44k • written 10.1 years ago by Nicolas Rosewick 11k

1

Entering edit mode

Yup, that's exactly what I'd use.

ADD REPLY • link updated 2.8 years ago by Ram 44k • written 10.1 years ago by Devon Ryan 104k

0

Entering edit mode

ok great ! thanks again. I accept the answer ;)

ADD REPLY • link updated 2.8 years ago by Ram 44k • written 10.1 years ago by Nicolas Rosewick 11k

0

Entering edit mode

Thanks for your answer Ryan. It is really helpful.

May I add some follow up questions please? (should I open a new discussion)

If one does an LRT test with ~condition as a full model and ~1 as a reduced model then the results names for the dds object are: "Intercept", "conditionA_vs_B", "conditionC_vs_B". I think I understood that the pvalues are the same whatever the "name" argument used for results(). But the LogFoldChanges seems to be dependent on the "name" argument. So I was wondering:

-How can one compare condition A versus C, as it is not within the results names?

-My understanding is that if one wants pairwise comparisons then Wald tests should be used. So what are those LogFold Changes and how can they be used?

Many thanks

ADD REPLY • link updated 3.6 years ago by Ram 44k • written 9.5 years ago by Aurelie MLB ▴ 360

0

Entering edit mode

There is a lot of information for you in ?results

See the 'test' argument and the 'contrast' argument descriptions for example. And see the section On p-values under Details.

In short: LRT gives you one p-value for there being any difference across the 3 levels. If you want to do pairwise comparisons, use 'contrast' and test="Wald". You can run Wald tests simply using results() if you are using version >= 1.6.

ADD REPLY • link updated 3.6 years ago by Ram 44k • written 9.5 years ago by Michael Love ★ 2.6k

1

Entering edit mode

Hello,

Thanks a lot for your answer. My understanding from the ?results is that:

If I have a "condition" factor such as c(A,B,C) and if I use a full model ~condition and a reduced model ~1 (with test=LRT), I will generate results with the following names: Intercept, condition_B_vs_A, condition_C_vs_A. Then I do understand that I can compare B versus A by doing:

results(dds, name=" condition_B_vs_A" , test="Wald")

and C versus A by doing:

results(dds, name=" condition_C_vs_A" , test="Wald")

What I really am not sure about is how to compare B and C with the results name I have. I do see that I would have to use "contrast". But my question is beyond DESeq2 I think, it is the formulas and constrasts in general that I am having difficulties with. A wild guess would be:

results(dds, constrast=c(0,1,-1) , test="Wald")

because (B-A)-(C-A) = B-C. Does it make any sense please?

ADD REPLY • link updated 3.6 years ago by Ram 44k • written 9.5 years ago by Aurelie MLB ▴ 360

0

Entering edit mode

Hi Aurelie,

you can modify the condition to c_A, b_B, a_C instead of A, B, C. This time, you can have condition_b_B_vs_a_C.

Aifu

ADD REPLY • link updated 3.6 years ago by Ram 44k • written 6.3 years ago by afli ▴ 190