Question

Multiple exposure timelines (time-points) in one DESeq2 object or multiple DESeq objects

2.4 years ago

salman_96 ▴ 70

Hi, I am working on a study comparing control and drug-exposed samples. An important part of the study design is that samples were collected at several time points (controls at 1 day vs. drug exposure at 1 day, controls at 4 days vs. drug exposure at 4 days, etc.). The metadata looks like this:

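# Exposure time in days, one entry per sample (18 samples)
time_days <- factor(c(10,10,10,10,10,10,4,4,4,4,4,4,1,1,1,1,1,1))
# Note: the original snippet also passed `dose` to data.frame(), but `dose`
# is never defined in the post, so it is left out here to keep this runnable.
coldata <- data.frame(condition = factor(c(
                        rep("control", 3),
                        rep("low", 3),
                        rep("control", 3),
                        rep("low", 3),
                        rep("control", 3),
                        rep("low", 3))),
                      time_days)
# Combined label per sample, e.g. "control_10" or "low_4"
coldata$Groups_of_Interest <- paste(coldata$condition, coldata$time_days, sep = "_")
coldata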

I would like advice on whether to keep all samples together or split them by exposure time. In other words, should there be a single DESeq2 object for all samples, or a separate DESeq2 object for each time point? I do understand that splitting the DESeq2 object may compromise the normalization step.

Are there any suggestions or resources I should look into?

Best regards

time points exposure drug DEGs multiple DESEQ2 • 1.2k views

updated 2.4 years ago by i.sudbery 20k • written 2.4 years ago by salman_96 ▴ 70

Rafael Soler · Answer 1 · 2022-07-04

2.4 years ago

Rafael Soler ★ 1.3k

The dispersion estimates will change depending on whether you analyze the data together or separately, because DESeq2 estimates one dispersion per gene from all samples in the object. The best strategy therefore depends on how the within-group variability compares across groups. If some groups are much more variable than others, the shared estimate will be too high for the less variable groups and comparisons among them lose power, so the best option is to analyze them separately. If all groups have a similar spread, it is best to keep the data together, since more samples give more stable dispersion estimates. You can use a PCA plot (plotPCA on transformed counts) to see whether the spread of samples within each group is small or large.
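A minimal sketch of that PCA check, assuming a raw count matrix `counts` (genes by samples, columns in the same order as the rows of `coldata`) that is not shown in the question. The helper name `pca_by_group` is made up for illustration; `DESeqDataSetFromMatrix`, `vst`, and `plotPCA` are DESeq2 functions. A PCA shows sample-to-sample spread rather than the dispersion estimates themselves, so treat it as a rough visual guide.

```r
# Sketch: visual check of how samples spread within each condition/time group.
# `counts` is an ASSUMED integer matrix (genes x samples); it is not part of
# the original post. `coldata` is the sample table from the question.
pca_by_group <- function(counts, coldata) {
  # The design does not matter for a blind transformation, so use intercept only.
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData   = coldata,
                                        design    = ~ 1)
  # Variance-stabilize so highly expressed genes do not dominate the PCA.
  vsd <- DESeq2::vst(dds, blind = TRUE)
  # Colour points by condition and time point; returns a ggplot object.
  DESeq2::plotPCA(vsd, intgroup = c("condition", "time_days"))
}
```

Calling `pca_by_group(counts, coldata)` would then draw the plot. If replicates of one group sit far apart while the others cluster tightly, that would point toward analyzing the groups separately.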

See also these posts as reference:

Dispersion estimation using DESeq2

DESeq2 with multiple variable give me different results

Best,

Rafa

updated 2.4 years ago by ATpoint 85k • written 2.4 years ago by Rafael Soler ★ 1.3k

+1, fyi you can simply post a plain biostars link to an existing thread/answer, it will then display the title of the thread, there is no need to embed links manually

ADD REPLY • link 2.4 years ago by ATpoint 85k

Oh, I see. Thank you :)

ADD REPLY • link 2.4 years ago by Rafael Soler ★ 1.3k

score 1 · Answer 2 · 2022-07-05

This depends on what question you are trying to answer.

If your question is "Which genes differ between treated and untreated at each time point?", then follow the advice set out above by @rafaelsoler9.
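For the per-time-point question, one option that keeps all samples in a single object is to fit a combined condition/time factor and extract one contrast per time point. The sketch below assumes a count matrix `counts` (not in the post); the helper name `per_timepoint_results` and the column name `group` are made up for illustration, and `group` plays the same role as `Groups_of_Interest` in the question.

```r
# Sketch: treated-vs-control comparisons at each time point from ONE object.
# Sample layout copied from the question: 3 controls + 3 "low" per time point.
time_days <- factor(rep(c(10, 4, 1), each = 6))
condition <- factor(rep(rep(c("control", "low"), each = 3), times = 3))
coldata   <- data.frame(condition, time_days)
# One combined factor per sample, e.g. "control_10" or "low_4".
coldata$group <- factor(paste(coldata$condition, coldata$time_days, sep = "_"))

# `counts` is an ASSUMED genes x samples integer matrix (not in the post).
per_timepoint_results <- function(counts, coldata) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData   = coldata,
                                        design    = ~ group)
  dds <- DESeq2::DESeq(dds)
  # One low-vs-control contrast per time point, named by day.
  days <- levels(coldata$time_days)
  res <- lapply(days, function(d) {
    DESeq2::results(dds, contrast = c("group",
                                      paste0("low_", d),
                                      paste0("control_", d)))
  })
  setNames(res, paste0("day_", days))
}
```

To analyze the time points separately instead, the same function could be given a subset, for example `counts[, coldata$time_days == "4"]` together with the matching rows of `coldata`, using `~ condition` as the design.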

However, if your question is "Can I find genes where the expression time course is altered by treatment?", then you want to analyze them together with an interaction design.

You would use a likelihood ratio test (LRT) to compare the full model:

~ time_days + condition + condition:time_days

to the reduced model:

~ time_days + condition

This will find genes whose time course differs between treatments.
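A minimal sketch of this comparison, assuming the sample layout from the question and a count matrix `counts` that is not shown there. The helper name `interaction_lrt` is made up for illustration; `DESeq(..., test = "LRT", reduced = ...)` is the DESeq2 interface for a likelihood ratio test.

```r
# Sample layout copied from the question: 3 controls + 3 "low" per time point.
time_days <- factor(rep(c(10, 4, 1), each = 6))
condition <- factor(rep(rep(c("control", "low"), each = 3), times = 3))
coldata   <- data.frame(condition, time_days)

full_design    <- ~ time_days + condition + condition:time_days
reduced_design <- ~ time_days + condition

# The interaction adds one coefficient per non-reference time point;
# these extra coefficients are what the LRT tests jointly.
n_extra <- ncol(model.matrix(full_design, coldata)) -
           ncol(model.matrix(reduced_design, coldata))

# `counts` is an ASSUMED genes x samples integer matrix (not in the post).
interaction_lrt <- function(counts, coldata) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData   = coldata,
                                        design    = full_design)
  # Fit the full model and compare it against the reduced model per gene.
  dds <- DESeq2::DESeq(dds, test = "LRT", reduced = reduced_design)
  DESeq2::results(dds)
}
```

With three time points, the test has two extra coefficients per gene. The p-values from the LRT refer to the interaction terms as a whole, while the log2 fold change column in the results table describes a single coefficient, so the two should not be read as a matched pair.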