I have an experiment that has 2 groups to compare (treated vs control), and for each group I have 12 time points, with 7 patients per time point/group. I am wondering what is the best way to approach the following questions:
- Does treatment have an effect? If so, at which time point?
- Are there striking time effects regardless of group?
To answer these questions, I've specified my design as ~0 + treatment + time + patient + group:time. I have included patient ID because the same patient is sampled at multiple time points, under both treated and control conditions.
Now let's say that the 12 time points are split across 2 days, day 1 and day 2, but I'm only interested (for now) in answering the questions above for day 1. I will examine day 2 at a later point. I can see 2 ways of proceeding:
- Model all treatments with a single design, specifying all contrasts (including any for day 2), but then extract only the comparisons for day 1.
- Subset the data and contrasts to include only day 1, and add day 2 at a later point.
In approach 1, I solve multiple problems at once, but the linear model will also be fitting the day 2 effects, which are not the most pressing question right now (though I will still want to answer it later). On the other hand, I can extract all contrasts of interest in a single pass, and it is possible that any effect on day 2 is influenced by what happened on day 1.
In approach 2, we examine everything separately. But this does not account for the possible effect that day 1 may have on the samples of day 2.
TLDR: I'm just wondering if you see any disadvantage in doing approach 1, even if I only want to examine day 1 and maybe later day 2.
I typically prefer approach 1. If you split the samples and later decide that you need contrasts that are only possible with all samples included, then either your results will change (normalization and dispersion estimates differ when samples are added or removed), or you end up with multiple analysis objects and have to go back and forth between them, which is tedious.
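To make the normalization point concrete, here is a minimal sketch in plain Python. It assumes a median-of-ratios style normalization; the thread does not say which tool or method is used, and the counts below are made up for illustration. The function name `size_factors` is hypothetical. The idea is only that the per-gene reference is computed across whichever samples are present, so dropping samples changes the scaling of the ones that remain.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors (illustrative sketch, not a library call).

    counts: list of genes, each a list of per-sample counts.
    """
    n_samples = len(counts[0])
    # Skip genes with a zero count, since their geometric mean would be zero.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Per-gene geometric mean over the samples currently in the data set.
    geo_means = [
        math.exp(sum(math.log(c) for c in row) / n_samples) for row in usable
    ]
    # Each sample's size factor is the median ratio to that reference.
    return [
        median(row[j] / gm for row, gm in zip(usable, geo_means))
        for j in range(n_samples)
    ]

# Made-up counts: 3 genes x 4 samples (two from day 1, then two from day 2).
counts = [
    [100, 200, 400, 800],
    [50, 100, 50, 100],
    [30, 60, 300, 600],
]

full = size_factors(counts)                           # all samples
day1_only = size_factors([row[:2] for row in counts]) # day 1 subset

print("day 1 size factors, all samples:", full[:2])
print("day 1 size factors, day 1 only: ", day1_only)
```

In this toy data the day 1 samples get different size factors depending on whether the day 2 samples are in the data set, which is why results from a subset and from the full model need not match.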
@ATpoint, I do agree with you, but I am nevertheless concerned about the FDR. Let's say that you want to tell this story in 2 parts: first focusing on day 1, and then on day 2. For the first part, you would be penalizing the p-values for the large number of tests belonging to day 2 as well.
...well, I guess you can always do it in a single pass, and then depending on what you want to examine, correct p-values accordingly.
I see your point, but I do not think this is what happens. FDR correction is applied contrast-wise, that is, to the nominal p-values within each contrast separately. The adjusted p-values for a given contrast will be the same whether you test one or one hundred contrasts.
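A small sketch of this in plain Python, assuming the correction is Benjamini-Hochberg (the thread only says "FDR") and using made-up p-values. The helper `bh_adjust` is written here for illustration and is not taken from any package. It compares adjusting the day 1 contrast on its own with pooling the p-values of both contrasts before adjusting.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up nominal p-values for the same 4 genes in two contrasts.
day1 = [0.001, 0.02, 0.03, 0.5]
day2 = [0.2, 0.4, 0.6, 0.9]

# Contrast-wise: each contrast is adjusted on its own p-values only,
# so the day 1 result does not depend on day 2 being tested at all.
day1_alone = bh_adjust(day1)
per_contrast = {name: bh_adjust(p) for name, p in {"day1": day1, "day2": day2}.items()}

# Pooled: all p-values adjusted together (the scenario the question worries about).
pooled_day1 = bh_adjust(day1 + day2)[: len(day1)]

print("day 1, adjusted alone:       ", day1_alone)
print("day 1, adjusted per contrast:", per_contrast["day1"])
print("day 1, adjusted after pooling:", pooled_day1)
```

With contrast-wise adjustment the day 1 values are identical in both runs. Only the pooled variant inflates them, and that would have to be requested explicitly.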
Perfect, thanks a lot for all the clarifications, I really appreciate it