Hi there. I am currently trying to run a DESeq2 analysis on some samples. I have completed a basic analysis, but now I need to do a much more complicated one. I have looked at a lot of videos and examples, but I still don't quite understand how to write the R script for what I want to do. I will try to give a detailed layout of my study.
I am studying the effect of a tomato virus on two plant lines, resistant and susceptible. For each plant line I have two groups, infected and control, and two time points, 15 and 35 days post infection (dpi). The time points also apply to the control group (control plants were inoculated with an empty plasmid to account for biological reactions to plant damage, keeping as many factors constant as possible).
Now, I want to see how the genes were differentially expressed:
- for each time point of each plant line (15 dpi infected vs 15 dpi control, and so on)
- between the time points within each plant line (15 dpi vs 35 dpi)
- between the two plant lines (15 dpi resistant vs 15 dpi susceptible, and 35 dpi resistant vs 35 dpi susceptible)
I know this means I am comparing different factor levels. My question is: do I use the contrast argument of results(), or how should I approach this?
Here is my sample information: (3 biological replicates for control, 4 biological replicates for infected)
Resistant
Sample    Treatment  Time point (dpi)  Plant line
RVR1:15   Infected   15                Resistant
RVR2:15   Infected   15                Resistant
RVR3:15   Infected   15                Resistant
RVR4:15   Infected   15                Resistant
RVR1:35   Infected   35                Resistant
RVR2:35   Infected   35                Resistant
RVR3:35   Infected   35                Resistant
RVR4:35   Infected   35                Resistant
RCR1:15   Control    15                Resistant
RCR2:15   Control    15                Resistant
RCR3:15   Control    15                Resistant
RCR1:35   Control    35                Resistant
RCR2:35   Control    35                Resistant
RCR3:35   Control    35                Resistant
Susceptible
Sample    Treatment  Time point (dpi)  Plant line
SVR1:15   Infected   15                Susceptible
SVR2:15   Infected   15                Susceptible
SVR3:15   Infected   15                Susceptible
SVR4:15   Infected   15                Susceptible
SVR1:35   Infected   35                Susceptible
SVR2:35   Infected   35                Susceptible
SVR3:35   Infected   35                Susceptible
SVR4:35   Infected   35                Susceptible
SCR1:15   Control    15                Susceptible
SCR2:15   Control    15                Susceptible
SCR3:15   Control    15                Susceptible
SCR1:35   Control    35                Susceptible
SCR2:35   Control    35                Susceptible
SCR3:35   Control    35                Susceptible
So how do I write the multifactor design for these groups/levels? I would really appreciate assistance on this, as I am lost and confused. Thank you.
Thank you for the response. Would it look something like this?
Do I need to add batch information?
Yes, like that. If you have batch information, and it isn't confounded with your experimental groups, you can just add it as a column and make your design ~ Batch + Condition. (Order doesn't matter when you specify the desired contrasts in the results call.) There are other ways to compare one subgroup to another, using designs with interactions, but this way is pretty much equivalent, and way, way more readable.
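To make the single-factor approach concrete, here is a minimal base R sketch of the sample sheet, rebuilt from the tables in the question. The column names (Line, Treatment, Dpi, Condition) and the object names are illustrative assumptions, and the Batch column in the second formula is hypothetical, since no batch information was posted.

```r
# Factors as laid out in the question: per plant line, 4 infected + 4 infected
# + 3 control + 3 control samples (15 dpi first, then 35 dpi).
line      <- rep(c("resistant", "susceptible"), each = 14)
treatment <- rep(rep(c("infected", "control"), times = c(8, 6)), times = 2)
dpi       <- rep(c(rep(c("15", "35"), each = 4),
                   rep(c("15", "35"), each = 3)), times = 2)

# Rebuild the sample IDs used in the question, e.g. "RVR1:15" or "SCR3:35".
rep_no    <- ave(seq_along(line), line, treatment, dpi, FUN = seq_along)
sample_id <- paste0(ifelse(line == "resistant", "R", "S"),
                    ifelse(treatment == "infected", "V", "C"),
                    "R", rep_no, ":", dpi)

coldata <- data.frame(Line = line, Treatment = treatment, Dpi = dpi,
                      row.names = sample_id)

# One combined grouping factor, e.g. "susceptible_control_35".
coldata$Condition <- factor(paste(line, treatment, dpi, sep = "_"))

# Replicates per group: 3 for each control group, 4 for each infected group.
print(table(coldata$Condition))

# Design formulas to pass to DESeqDataSetFromMatrix().
design_no_batch   <- ~ Condition
design_with_batch <- ~ Batch + Condition  # only if coldata has a Batch column
```

The row names of coldata have to match the column names of your count matrix, in the same order.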
Thank you for the feedback. I appreciate the assistance. One final question: do I have to label each replicate with a suffix, e.g. susceptible_control_35_A? Or will DESeq2 recognize the biological replicates?
Did you look at the DESeq2 vignette? Do they include replicate numbers in their colData? DESeq2 will take all the samples with the group name you specify, and compare them to all the samples with the second group name you specify. If you include replicate numbers in there, every sample will have its own group. You don't want that.
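A quick base R illustration of that point, using two of the group labels from this thread. The object names are made up for the example.

```r
# Group labels for 3 control and 4 infected samples, without replicate suffixes.
condition <- rep(c("susceptible_control_35", "susceptible_infected_35"),
                 times = c(3, 4))
good <- factor(condition)

# Same samples, but with a replicate letter appended to each label.
rep_letter <- LETTERS[ave(seq_along(condition), condition, FUN = seq_along)]
bad <- factor(paste(condition, rep_letter, sep = "_"))

print(table(good))    # two groups, with 3 and 4 replicates
print(nlevels(bad))   # 7: every sample is now its own group
```

Replicate identity belongs in the sample names (the row names of colData), not in the grouping column.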
The titles are the same as the second table I sent through (susceptible_control_35). So I should leave them as they are. How do you specify which group goes first and which second? I am sorry if that's a stupid question, but I am still trying to figure this all out.
Did you look at the vignette?
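For reference, a short sketch of how the order is set. In the character form of the contrast argument of results(), the second element is the numerator and the third is the denominator, so the reported log2 fold change is numerator over denominator. The group labels follow the susceptible_control_35 pattern from this thread; the column name Condition, the list contrast_list and the helper run_contrasts() are illustrative names, and dds is assumed to be your own DESeqDataSet after DESeq() has been run with Condition in the design.

```r
# contrast = c(<colData column>, <numerator group>, <denominator group>)
# log2FoldChange is then log2(numerator / denominator).
contrast_list <- list(
  # infected vs control, per plant line and time point
  R15_inf_vs_ctrl = c("Condition", "resistant_infected_15",   "resistant_control_15"),
  R35_inf_vs_ctrl = c("Condition", "resistant_infected_35",   "resistant_control_35"),
  S15_inf_vs_ctrl = c("Condition", "susceptible_infected_15", "susceptible_control_15"),
  S35_inf_vs_ctrl = c("Condition", "susceptible_infected_35", "susceptible_control_35"),
  # 35 dpi vs 15 dpi within each plant line (infected plants)
  R_inf_35_vs_15  = c("Condition", "resistant_infected_35",   "resistant_infected_15"),
  S_inf_35_vs_15  = c("Condition", "susceptible_infected_35", "susceptible_infected_15"),
  # resistant vs susceptible at each time point (infected plants)
  inf15_R_vs_S    = c("Condition", "resistant_infected_15",   "susceptible_infected_15"),
  inf35_R_vs_S    = c("Condition", "resistant_infected_35",   "susceptible_infected_35")
)

# Hypothetical helper: one results table per contrast.
run_contrasts <- function(dds, contrasts) {
  lapply(contrasts, function(ct) DESeq2::results(dds, contrast = ct))
}

# Usage, once dds <- DESeq(dds) has been run:
# res_list <- run_contrasts(dds, contrast_list)
```

A positive log2 fold change in R15_inf_vs_ctrl would then mean higher expression in resistant infected plants than in resistant controls at 15 dpi.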
I am doing it by myself, I am setting up the DESeq2. I have tried reading the manual for DESeq2 and looking at examples. The sample data and counts data are what I put together. I hope that makes sense.
This is what I have for resistant 15dpi infected vs resistant 15dpi control
What is really confusing is the fact that each time point has its own control. So if I wanted to compare resistant 15dpi infected with susceptible 15dpi infected, which control would I use, the resistant control or the susceptible control? I don't think it's possible then to compare the resistant with the susceptible.
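One possible way to look at this, as a sketch rather than a definitive recipe: with a combined Condition factor, two different questions can be asked. A direct contrast compares the two infected groups with each other and involves no control at all. A difference-of-differences contrast uses both controls, each with its own line, and asks whether the infection response differs between the lines. The sketch below assumes the design is exactly ~ Condition (with a Batch term, the model matrix would need the same formula as the design); group_coefs() and compare_lines_15() are illustrative helper names, and dds is assumed to be your fitted DESeqDataSet.

```r
# Rebuild the grouping: 3 replicates per control group, 4 per infected group.
groups <- expand.grid(dpi = c("15", "35"),
                      treatment = c("control", "infected"),
                      line = c("resistant", "susceptible"))
n_reps  <- ifelse(groups$treatment == "infected", 4, 3)
coldata <- groups[rep(seq_len(nrow(groups)), times = n_reps), ]
coldata$Condition <- factor(paste(coldata$line, coldata$treatment,
                                  coldata$dpi, sep = "_"))

# (a) Direct comparison of the infected groups at 15 dpi.
direct_15 <- c("Condition", "resistant_infected_15", "susceptible_infected_15")

# (b) Difference of infection effects at 15 dpi:
#     (resistant infected - resistant control)
#       - (susceptible infected - susceptible control)
mm <- model.matrix(~ Condition, data = coldata)
group_coefs <- function(g) colMeans(mm[coldata$Condition == g, , drop = FALSE])
interaction_15 <-
  (group_coefs("resistant_infected_15")   - group_coefs("resistant_control_15")) -
  (group_coefs("susceptible_infected_15") - group_coefs("susceptible_control_15"))

# results() also accepts a numeric contrast with one weight per coefficient.
compare_lines_15 <- function(dds) {
  list(direct      = DESeq2::results(dds, contrast = direct_15),
       interaction = DESeq2::results(dds, contrast = interaction_15))
}
```

Note that contrast (a) also picks up any baseline difference between the two lines, while contrast (b) removes it by subtracting each line's own control.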