I do not have a specific question as such; I am putting up this post to get general views from other statisticians or analysts who have experience in analyzing time-course RNA-seq data. It would be great if anybody could share their views, the challenges they have experienced while analyzing such data, or their thoughts on experimental design.
Just to mention another very important issue: what tools will be used to analyze your time-series RNA-seq data? Whether an experimental design is appropriate depends on the biological question. However, if no tool can handle the experimental design, the biological question cannot be answered with current tools alone. In my experience, STEM (Ernst & Bar-Joseph, 2006 [ref1]), GATE (MacArthur et al., 2010 [ref2]), and DREM (Schulz et al., 2012 [ref3]) have been used to analyze our time-series RNA-seq data. Each of them has different underlying principles and its own limitations. There are other time-series RNA-seq tools listed in Omic Tools for your reference: Home > Sequencing > RNA-seq > Time course and Home > Microarray > Gene expression array > Time course (http://omictools.com).
References
ref1: Ernst, J. & Bar-Joseph, Z. (2006). STEM: a tool for the analysis of short time series gene expression data. BMC Bioinformatics 7.
ref2: MacArthur, B.D., Lachmann, A., Lemischka, I.R. & Ma'ayan, A. (2010). GATE: software for the analysis and visualization of high-dimensional time series expression data. Bioinformatics 26, 143-144.
ref3: Schulz, M.H., Devanny, W.E., Gitter, A., Zhong, S., Ernst, J. & Bar-Joseph, Z. (2012). DREM 2.0: Improved reconstruction of dynamic regulatory networks from time-series expression data. BMC Syst. Biol. 6.
written 9.7 years ago by Gary (updated 2.5 years ago by Ram)
Based on my experience, several things are worth considering in the experimental design and data analysis.
How many replicates per time point?
How many time points, and how should they be distributed along the course? You can place time points more densely within a particularly interesting time window; they need not be evenly spaced.
Try to fit all your samples onto one chip for sequencing to avoid batch effects, so know your sample size limit. (If that is not possible, plan the chip layout around the comparisons you want to make, so that batch effects will not be a concern.)
Try to perform all your replicate experiments at the same time, using the same equipment and the same hands, to avoid batch effects.
When planning the time points, take into account how long it takes to extract the samples. Don't place two time points too close together.
Samples drawn from different time points can sometimes have very different transcriptome profiles. Be careful with normalisation if this is the case, since many of the normalisation methods in popular pipelines assume that few genes are differentially expressed between samples.
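To make the normalisation caveat concrete, here is a minimal sketch of a median-of-ratios style size factor, the general idea behind DESeq-type normalisation. It is a simplified illustration, not any package's actual implementation. The function name `size_factors`, the sample names, and the toy counts are all hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (simplified sketch).

    counts: dict mapping sample name -> list of raw counts, one per gene,
    with genes in the same order for every sample.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Per-gene log geometric mean across samples; genes with a zero count
    # in any sample are skipped because their log is undefined.
    log_geo_mean = {}
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_mean[g] = sum(math.log(v) for v in values) / len(values)
    # Each sample's factor is its median ratio to that per-gene reference.
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][g]) - ref for g, ref in log_geo_mean.items()]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Hypothetical toy data: 10 genes, two time points.
few_changed = {"t0": [100] * 10, "t1": [100] * 8 + [400] * 2}
many_changed = {"t0": [100] * 10, "t1": [100] * 4 + [400] * 6}

sf_few = size_factors(few_changed)
sf_many = size_factors(many_changed)
print(sf_few)
print(sf_many)
```

When only 2 of the 10 genes change, both size factors stay at 1 and the data are left alone. When 6 of the 10 change, the factors come out at about 0.5 and 2, so after dividing by them the six genes that really changed look flat and the four unchanged genes look four-fold down. That is the failure mode to watch for when time points differ strongly.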
written 9.7 years ago by jing (updated 2.5 years ago by Ram)
From my (only recently acquired) experience, I can say that it helps to have a solid understanding of statistics and bioinformatics. Furthermore, don't make the mistake of running an experiment without replicates, because you need them to estimate the variation within your samples.
Other than that, I can only recommend reading some papers to see what's possible.
I am also doing some RNA-seq DE analysis at the moment, so I can tell you it's not that easy.
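As a small illustration of why replicates are needed, the sketch below tries to estimate within-group variance from one value and from two. The helper name `within_group_variance` and the numbers are made up for the example; the only library call is Python's standard `statistics.variance`.

```python
from statistics import StatisticsError, variance


def within_group_variance(values):
    """Return the sample variance, or None when it cannot be estimated."""
    try:
        return variance(values)
    except StatisticsError:
        # Raised when fewer than two data points are supplied.
        return None


# Hypothetical log-expression values for one gene at one time point.
single_sample = [5.1]
two_replicates = [5.0, 7.0]

print(within_group_variance(single_sample))
print(within_group_variance(two_replicates))
```

With a single sample there is simply nothing to estimate the variation from, so no test can tell a real change apart from noise.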
written 9.7 years ago by john (updated 2.5 years ago by Ram)
Experimental design plays a very important role in obtaining high-confidence results. RNA-seq experimental design is generally affected by two parameters:
a) Technical replicates (sequencing depth): help to remove false positives.
b) Biological replicates: help to remove false negatives.
The number of differentially expressed genes detected is also affected by the number of replicates (both technical and biological).
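The link between biological replicates and false negatives can be sketched with a rough power calculation. This assumes a two-group comparison of one gene on the log2 scale, a normal approximation with known standard deviation, and no multiple-testing correction, so treat it as a back-of-the-envelope guide only. The function `approx_power`, the effect size, and the standard deviation are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_reps, log2_fc, sd, alpha=0.05):
    """Approximate two-sided power to detect a log2 fold change.

    Normal approximation with n_reps biological replicates per group and
    a common, known between-replicate standard deviation `sd`.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se = sd * sqrt(2 / n_reps)       # standard error of the group difference
    shift = abs(log2_fc) / se        # true effect in standard-error units
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


replicates = [2, 3, 4, 6, 8]
power_by_n = {n: approx_power(n, log2_fc=1.0, sd=0.5) for n in replicates}
for n, p in power_by_n.items():
    print(f"{n} replicates per group: approx. power {p:.2f}")
```

Under these assumptions, power is only around one half with two replicates per group and climbs steadily as replicates are added, which is another way of saying that fewer true changes are missed.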