Question

Calling differential gene expression in Time series

9.9 years ago

Opt ▴ 50

Hi,

Is there a tool to call the differential expression of a gene in a time series that also tries to tell at which time point the differential expression started?

Thanks

sequencing gene-expression RNA-Seq • 5.9k views

updated 2.6 years ago by Ram 44k • written 9.9 years ago by Opt ▴ 50

Ram · Answer 1 · 2014-12-27

Given your wording, the simplest solution would be to use edgeR, DESeq2, or limma (put the data through voom() first) and just specify the timepoints as unordered factors (rather than numeric time points). This will allow for non-linear trends and, thus, differences due to changes at a single time point. Something like the following would work as a design (you could get more complicated, of course):

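tp <- rep(c(1, 2, 5, 10), each = 3) # 3 replicates per time point
df <- data.frame(timepoints = factor(sprintf("tp%i", tp), levels = sprintf("tp%i", c(1, 2, 5, 10)))) # explicit levels keep tp10 after tp5
mm <- model.matrix(~0 + timepoints, df) # Or keep the intercept with DESeq2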

Using "0+" is better for edgeR and limma: with an intercept, every coefficient is a difference from the first time point, so contrasts between any other pair of time points get awkward to write. Before you ask, no, the negative-binomial methods don't have post-hoc tests like a 1-way ANOVA (no one had come up with a way to do that last I checked).
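To get at when the change starts, one option with the "0+" design is to compare each time point with the one before it. Here is a minimal base-R sketch of such a contrast matrix, assuming the same four time points with three replicates each; the contrast names (tp2_vs_tp1 and so on) are made up for illustration.

```r
# Same layout as the design above: 4 time points, 3 replicates each (assumed).
tp_levels <- sprintf("tp%i", c(1, 2, 5, 10))
df <- data.frame(timepoints = factor(rep(tp_levels, each = 3), levels = tp_levels))
mm <- model.matrix(~0 + timepoints, df)
colnames(mm) <- tp_levels  # shorter column names, one per time point

# One contrast per pair of adjacent time points: later minus earlier.
n <- length(tp_levels)
cm <- matrix(0, nrow = n, ncol = n - 1,
             dimnames = list(tp_levels,
                             paste0(tp_levels[-1], "_vs_", tp_levels[-n])))
for (j in seq_len(n - 1)) {
  cm[j, j] <- -1      # earlier time point
  cm[j + 1, j] <- 1   # later time point
}
cm
```

A matrix like this could then be handed to limma's contrasts.fit(), or one column at a time to the contrast argument of edgeR's glmLRT(). The earliest contrast that comes out significant for a gene is a rough guess at where its change begins, not a formal onset estimate.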

Ram · Answer 2 · 2015-09-21

I recommend reading the review paper [1]. Also have a look at the EBSeq-HMM package [2], which identifies DEGs while taking the time dependency into account and then clusters them by expression trend. You can easily filter the DEGs and trend clusters by FDR and posterior probability (PP).

[1] Bar-Joseph, Z., et al. (2012). "Studying and modelling dynamic biological processes using time-series gene expression data." Nat Rev Genet 13(8): 552-564.
[2] Leng, N., et al. (2015). "EBSeq-HMM: a Bayesian approach for identifying gene-expression changes in ordered RNA-seq experiments." Bioinformatics.

Ram · Answer 3 · 2014-12-29

If I understand the question correctly, I think this is tricky since it requires defining two different sets of patterns in the data. For example, I am assuming that you would be interested in something like this:

Time:          T1     T2     T3     T4     T5     T6
Expression:    1.1    1.0    0.9    5.3    5.1    4.8

In this case, I guess the time point you are interested in is T4.
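For this toy gene, the idea of a "start" point can be sketched in a few lines of base R. This is only an illustration for a single gene without replicates: the 2-fold cutoff and the choice of T1 as baseline are assumptions, and there is no statistical test involved.

```r
# Toy expression values from the example above.
expr <- c(T1 = 1.1, T2 = 1.0, T3 = 0.9, T4 = 5.3, T5 = 5.1, T6 = 4.8)

# Log2 fold change of each time point relative to the first one (baseline).
log2fc <- log2(expr / expr[["T1"]])

# Hypothetical cutoff: flag a time point at a 2-fold difference or more.
changed <- abs(log2fc) >= 1

# Onset = first flagged time point (NA if nothing passes the cutoff).
onset <- if (any(changed)) names(expr)[which(changed)[1]] else NA_character_
onset  # "T4"
```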

Also, the "start" point is relative to measurements. For example, consider cyclic expression that follows a cosine function: here, the shape is informative but defining an initially interesting time point is somewhat artificial.

In general, if your goal is to find interesting shapes across the time points, then I would check out this discussion: Rna-Seq Time Course Data

If you have a biological reason to be interested in a particular time point, you could define early and late time points as separate groups. For example, defining genes that vary between T1/T2/T3 and T4/T5/T6 should identify the hypothetical gene that I described above.
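The early/late grouping could be set up roughly like this in base R. The three replicates per time point and the names (phase, samples, mm_phase) are assumptions for the sake of the sketch.

```r
# Six time points with 3 replicates each (replicate number is assumed).
tp <- factor(rep(paste0("T", 1:6), each = 3), levels = paste0("T", 1:6))

# Collapse T1-T3 into "early" and T4-T6 into "late".
phase <- factor(ifelse(tp %in% c("T1", "T2", "T3"), "early", "late"),
                levels = c("early", "late"))
samples <- data.frame(timepoint = tp, phase = phase)

# Intercept = early mean; the "phaselate" coefficient = late minus early.
mm_phase <- model.matrix(~phase, samples)
table(samples$phase)
```

Testing the phaselate coefficient in edgeR, DESeq2, or limma then asks whether a gene differs between the two halves of the time course. Where to put the split is a biological decision, not something this design finds for you.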

On the other hand, if you are interested in changes at a single time point, then I think Devon gave a good answer.