I am analysing the following experiment: two cell lines representing the same disease subtype were treated with a drug or DMSO (control) and the samples were collected in 4 time points: 0.5, 1, 6, and 16 h. Two types of sequencing were performed for each sample: Ribo-Seq to measure translatome and RNA-Seq to measure transcriptome. If you prefer a table view, see below for the full experiment table.
Now, I would like to test for any differences between drug and DMSO over multiple time points, that is specific to Ribo-Seq (occurs mainly at the translation level).
According to the DESeq2 manual for RNA-Seq data analysis the simplest testing for any differences over multiple time points looks like:
dds <- DESeqDataSetFromMatrix(counts, ~ condition + time + condition:time)
dds <- DESeq(dds, test = "LRT", reduced = ~ condition + time
After reading many posts at this forum regarding the DESeq2 experiment design I assume that the design matrix that takes into account two types of sequencing and pairwise comparisons within the same cell line should be:
dds <- DESeqDataSetFromMatrix(counts, ~ experiment + cell_line + condition + time + experiment:condition + condition:time)
dds <- DESeq(dds, test = "LRT", reduced = ~ experiment + condition + time + experiment:condition
Is it a correct reasoning?
Tabular view of the experiment:
Experiment Cell_line Condition Time (h)
_______________________________________
Ribo_seq A Drug 0.5
Ribo_seq A Drug 1
Ribo_seq A Drug 6
Ribo_seq A Drug 16
Ribo_seq B Drug 0.5
Ribo_seq B Drug 1
Ribo_seq B Drug 6
Ribo_seq B Drug 16
Ribo_seq A Control 0.5
Ribo_seq A Control 1
Ribo_seq A Control 6
Ribo_seq A Control 16
Ribo_seq B Control 0.5
Ribo_seq B Control 1
Ribo_seq B Control 6
Ribo_seq B Control 16
RNA_seq A Drug 0.5
RNA_seq A Drug 1
RNA_seq A Drug 6
RNA_seq A Drug 16
RNA_seq B Drug 0.5
RNA_seq B Drug 1
RNA_seq B Drug 6
RNA_seq B Drug 16
RNA_seq A Control 0.5
RNA_seq A Control 1
RNA_seq A Control 6
RNA_seq A Control 16
RNA_seq B Control 0.5
RNA_seq B Control 1
RNA_seq B Control 6
RNA_seq B Control 16
I'm not sure it's wise to mix ribo-seq and RNA-seq in the same DESeq2 analysis like this. They have such different biases that I worry they'll muck up a lot of the fitting and variance shrinkage. Usually you'd combine these two further downstream in an analysis.
Thank you for such a quick reply. This is a good point. Usually with a simple experiment design I do separate analyses for RNA-Seq and Ribo-Seq experiment, which in that case should look like this:
Then I just simply take the genes that are significant only in the Ribo-Seq comparison. However, I am concerned that actually my final list of differentially translated genes is not a result of any statistical testing. Just intersection of two comparisons without control of false positives/negatives rate.
The ribo-seq is directly telling you about changes in translation, the RNA-seq is giving orthogonal information related to whether such changes are purely expression related. Given that, they really should be analysed separately. I wouldn't use set overlaps between them, I don't think that'd be useful.
Yes I agree. The problem is that translation changes are highly correlated with transcript abundance and we don't know what is the "normal" translational response to changing mRNA abundance. The safe option seems to be define translationally regulated genes as genes that shows changes only in the Ribo-Seq experiment. On the other hand, Ribo-Seq experiments have usually much lower library size than paired RNA-Seq, so larger spread of changes in Ribo-Seq may be attributable just to sampling error. It is quite difficult to find an optimal strategy to analyse such experiments.
I have a feeling using an analysis like DEXseq would be useful in this regard. You might be doing the same thing with this model (i.e., controlling for expression changes), I'd have to think about it a bit more.
Hi, these two workflows suggest that it's okay to mix ribo-seq and RNA-seq in the same differential expression analysis 1) http://girke.bioinformatics.ucr.edu/systemPipeR/pages/mydoc/systemPipeRIBOseq.html#11_differential_ribosome_loading_analysis_(translational_efficiency)
2) https://plastid.readthedocs.io/en/latest/examples/gene_expression.html#estimating-translation-efficiency
At the end we used DESeq2 for Ribo-Seq and RNA-Seq separately as you suggested. Then we used change in translation efficiency (TE) log2(normalised expression measured by Ribo-Seq/normalised expression measured by RNA-Seq) to identify translationally regulated genes. We set arbitrary threshold for that. It worked well - we validated the results using mass-spec that was in line with Ribo-seq. I'm a bit worried now about lack of formal testing for significance of change in TE. From our experience with ribo-seq specific tools that test on TE, usually they are biased towards highly or lowly expressed genes or/and can't accommodate with complex experiment design. Therefore I am not too keen to use them. To sum up: 1. Do you think that lack of test for TE is a problem? 2. If so, do you think that a convincing solution is to cluster genes that are differentially expressed in Ribo-Seq according to their pattern of expression in order to identify those alerted predominantly at translation level?
Hi hibernicah I hope you manged your analysis. How did you do RiboSeq analysis? Can you provide me a pipeline? I know alignment and get. Bam file but I don't know which package works smoothly, dose Deseq2 works as in ranseq analysis?
Best regards Bpshtiwan@gmail.com