Question

Interpreting time-series analysis with DESeq2 output

2.0 years ago

arshad1292 ▴ 110

Hello all,

I read and followed the time-course experiment section of the DESeq2 workflow vignette:

https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#time-course-experiments

Please see the attached figure, which was generated with this model:

LRT p-value: '~ strain + minute + strain:minute' vs '~ strain + minute'

I have also generated a similar figure with my own data, but I am having a hard time interpreting it.
[Figure: attached plot, image not available]

Here are my questions.

What do these dots represent? Are they genes?
It seems that 3 genes from each of mut and wt are being tracked from time 0 to 150. If that is correct, can I assume that only 3 genes are differentially expressed between strains across the four time points?
What do the lines (blue and red) represent? Do they show the mean counts of the three genes in each group?

Please clarify it for me and many thanks in advance!

time-series DESeq2 • 1.6k views

ADD COMMENT • link 2.0 years ago by arshad1292 ▴ 110

score 0 · Answer 1 · 2022-11-09

2.0 years ago

swbarnes2 14k

I don't know how you think we would know what you graphed, but if I had to guess, I'd say it was 1 gene, because you have replicates.
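To make the "one gene, several replicates" reading concrete, here is a minimal base-R sketch. The data frame `one_gene` and all of its values are made up for illustration (they are not from the fission data set): each row corresponds to one dot in such a plot, and each group mean corresponds to one vertex of a line.

```r
# Made-up counts for ONE gene: 2 strains x 2 time points x 3 replicates.
one_gene <- data.frame(
  strain = rep(c("wt", "mut"), each = 6),
  minute = rep(rep(c(0, 15), each = 3), times = 2),
  count  = c(10, 12, 14,  20, 22, 24,   # wt  at 0 and 15 min
             11, 13, 15,  40, 44, 48)   # mut at 0 and 15 min
)

# 12 rows = 12 dots; one mean per strain and time point = 4 line vertices.
line_vertices <- aggregate(count ~ strain + minute, data = one_gene, FUN = mean)
print(line_vertices)
```

This sketch takes the plain arithmetic mean. In a ggplot2 plot that also uses scale_y_log10(), the scale transformation is applied before the summary is computed, so the drawn line follows the mean of the log10 counts rather than the arithmetic mean.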

ADD COMMENT • link 2.0 years ago by swbarnes2 14k

Sorry, I did not include all the details because they are available in the DESeq2 vignette. This figure is also from that DESeq2 tutorial.

OK, it makes sense if these are the replicates of one gene. My next question: does DESeq2 randomly select one gene and draw the figure? How can we find out which gene it selected for this figure, and can we change the gene?

Many thanks

ADD REPLY • link 2.0 years ago by arshad1292 ▴ 110

It does not do anything randomly but does what you tell it. To figure that out please add code.

ADD REPLY • link 2.0 years ago by ATpoint 85k

Thank you for your message. Here is the code, which is also available in the DESeq2 vignette.

Full model:

ddsTC <- DESeqDataSet(fission, ~ strain + minute + strain:minute)

Reduced model:

ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ strain + minute)

And results:

resTC <- results(ddsTC)

Then for plotting:

fiss <- plotCounts(ddsTC, which.min(resTC$padj), intgroup = c("minute","strain"), returnData = TRUE)
fiss$minute <- as.numeric(as.character(fiss$minute))

ggplot(fiss, aes(x = minute, y = count, color = strain, group = strain)) + geom_point() + stat_summary(fun = mean, geom = "line") + scale_y_log10()

So, the code above plots the one gene with the smallest adjusted p-value. The dots show the replicates in each group, and each line shows the mean of the normalized counts in a group across the time points, right? My question is: five of my genes have the same adjusted p-value. Which one would it plot? Many thanks
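For reference, a minimal base-R sketch of how which.min() treats ties and NA values. The `padj` vector and its gene names are hypothetical, standing in for `resTC$padj`:

```r
# Hypothetical adjusted p-values; gene names are made up for illustration.
padj <- c(geneA = 0.20, geneB = NA, geneC = 1e-10, geneD = 1e-10, geneE = 0.03)

# which.min() skips NA and returns the index of the FIRST minimum,
# so among tied genes the one that comes first in the table is used.
first_min <- which.min(padj)
print(first_min)

# To list every gene that shares the smallest adjusted p-value:
tied <- which(padj == min(padj, na.rm = TRUE))
print(names(tied))
```

Any of the tied indices or row names can be passed as the gene argument of plotCounts() in place of which.min(resTC$padj) to plot a different gene.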

ADD REPLY • link 2.0 years ago by arshad1292 ▴ 110

Of course it's not picking a random gene. Look at the code. You cannot do a complicated analysis like this by blindly copying and pasting and not understanding what your code is doing.

ADD REPLY • link 2.0 years ago by swbarnes2 14k

Dear swbarnes2, yes, you're right. I am confused about this complicated time-series analysis. If you could please help me with the following questions, it would improve my understanding a great deal. In the above example, I have one full model and 4 possible reduced models, as shown below:

Full model:

ddsTC <- DESeqDataSet(fission, ~ strain + minute + strain:minute)

Reduced model 1:

ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ strain + minute)

Reduced model 2:

ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ minute)

Reduced model 3:

ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ strain)

Reduced model 4:

ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ strain:minute)

So, how should each of the reduced models 1-4 be interpreted? Please correct me if I am wrong.

Reduced model 1 means we are testing the effect of the interaction between strain and time while regressing out the individual effects of strain and time?

Reduced model 2 means we are testing the effect of strain over time (interaction) while regressing out the effect of time (minute)?

Reduced model 3 means we are testing the effect of time and the strain-by-time interaction while regressing out the individual effect of strain?

Reduced model 4 means we are testing the effect of strain over time (minute) while regressing out the interaction between strain and time?
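One way to see what a likelihood ratio test compares is to list the coefficient columns that the full design has and a reduced design lacks: those are the terms being tested. Below is a minimal base-R sketch using model.matrix() on a made-up sample table (`samples`, two strains and three time points, not the real fission data); only reduced models 1 and 2 are shown.

```r
# Hypothetical sample table: 2 strains x 3 time points (not the fission data).
samples <- expand.grid(
  strain = factor(c("wt", "mut"), levels = c("wt", "mut")),
  minute = factor(c(0, 15, 30))
)

# Coefficient names of the full design and of two reduced designs.
full     <- colnames(model.matrix(~ strain + minute + strain:minute, samples))
reduced1 <- colnames(model.matrix(~ strain + minute, samples))
reduced2 <- colnames(model.matrix(~ minute, samples))

# Columns present in the full design but absent from the reduced one.
tested1 <- setdiff(full, reduced1)
tested2 <- setdiff(full, reduced2)
print(tested1)
print(tested2)
```

With reduced model 1, only the strain:minute interaction columns are dropped, so the test asks whether the strain difference changes at any time point after time 0. With reduced model 2, the strain main effect is dropped as well, so the test asks whether strain makes any difference at any time point, including time 0.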

Sorry, I may be confused on how this complex model works but I think if I understand how to interpret the reduced model, then I will be better able to analyse my time-series data.

Many thanks and sorry for my confusion on this topic.

ADD REPLY • link 2.0 years ago by arshad1292 ▴ 110

Your posted question is about the meaning of the graph. The model applied is irrelevant to what the graph looks like.

ADD REPLY • link 2.0 years ago by swbarnes2 14k

Ok, I will post it as a separate question for my understanding. Thank you for your time.

ADD REPLY • link 2.0 years ago by arshad1292 ▴ 110