I have a dataset like this:
Genotype  condition   time
A         uninfected  0 day
A         uninfected  0 day
A         uninfected  0 day
A         mock        2 day
A         mock        2 day
A         mock        2 day
A         infected    2 day
A         infected    2 day
A         infected    2 day
A         mock        4 day
A         mock        4 day
A         mock        4 day
A         infected    4 day
A         infected    4 day
A         infected    4 day
B         uninfected  0 day
B         uninfected  0 day
B         uninfected  0 day
B         mock        2 day
B         mock        2 day
B         mock        2 day
B         infected    2 day
B         infected    2 day
B         infected    2 day
B         mock        4 day
B         mock        4 day
B         mock        4 day
B         infected    4 day
B         infected    4 day
B         infected    4 day
I need to fit a model that lets me compare the effect of treatment over time between the two genotypes.
I used the design formula Genotype + condition + time + condition:time + Genotype:time
and then ran an LRT (likelihood ratio test) against the reduced formula Genotype + condition + time.
Because there is only one uninfected condition (at day 0) in each genotype, it reports the error:
Model matrix is not full rank
1. Which model can I use for a time-series comparison with this type of data?
2. Can I add the uninfected samples again before day 4 (as they are the same)? Would that be statistically correct?
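To make the "not full rank" message concrete, here is a minimal sketch in plain Python that rebuilds the sample sheet above and counts how many linearly independent columns each design has. The dummy coding (reference levels A, uninfected, day 0) and the helper names `matrix_rank`, `original_design`, and `grouped_design` are my own assumptions for illustration; they are not what the analysis software does internally, although the rank does not depend on which reference levels are chosen. The second design, which merges condition and time into a single five-level group, is only one possible re-parameterisation and is not something proposed elsewhere in this thread.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gauss-Jordan elimination on exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # column depends on earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# The five (condition, day) combinations that exist in the experiment.
cells = [("uninfected", 0), ("mock", 2), ("infected", 2), ("mock", 4), ("infected", 4)]
# Two genotypes, three replicates per combination: 30 samples.
samples = [(g, c, t) for g in ("A", "B") for (c, t) in cells for _ in range(3)]


def original_design(g, c, t):
    """Genotype + condition + time + condition:time + Genotype:time (assumed dummy coding)."""
    g_b = int(g == "B")
    mock, infected = int(c == "mock"), int(c == "infected")
    t2, t4 = int(t == 2), int(t == 4)
    return [1, g_b, mock, infected, t2, t4,
            mock * t2, mock * t4, infected * t2, infected * t4,
            g_b * t2, g_b * t4]


def grouped_design(g, c, t):
    """Genotype + group + Genotype:group, where group = condition and time combined."""
    g_b = int(g == "B")
    group = [int((c, t) == cell) for cell in cells[1:]]  # reference: uninfected, day 0
    return [1, g_b] + group + [g_b * x for x in group]


x_orig = [original_design(*s) for s in samples]
x_grp = [grouped_design(*s) for s in samples]
rank_orig, ncol_orig = matrix_rank(x_orig), len(x_orig[0])
rank_grp, ncol_grp = matrix_rank(x_grp), len(x_grp[0])

print(f"original design: rank {rank_orig} of {ncol_orig} columns")
print(f"grouped design:  rank {rank_grp} of {ncol_grp} columns")
```

In the original design the day 2 and day 4 columns are exact sums of their condition:time interaction columns, and likewise the mock and infected columns, because uninfected exists only at day 0 and day 0 contains only uninfected. That is why fewer columns are independent than there are coefficients to estimate. Whether the grouped coding answers the biological question is a separate matter and would need to be checked against the contrasts of interest.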
You have the same problem as discussed in "Error in differential analysis for samples with different time points". In short: no biological replicates.
The uninfected samples aren't useful, so you can remove them. Also, as you have no replicates, you will have to ignore the effect of either Genotype, condition, or time. A better solution would be to sequence replicates (ideally 6 of each condition, but at minimum 3 of each).
Thank you for the reply. Sorry, I forgot to mention that I have 3 biological replicates for each. According to my work plan, I want to compare uninfected with mock and with infected.
Why aren't the uninfected samples useful? They provide a T_0 baseline for the rest of the experiment.
Yes, we are also using them as the baseline to compare with mock and infected.
They have no corresponding samples with which to compare, so it's unclear if any change is actually due to time or just "doing something" (i.e., mock treatment).
On the contrary, if they didn't have the T_0 sample, their mock-infection placebo could have profound time-dependent effects, but they'd have no baseline sample against which to compare those effects and wouldn't be able to identify them. This is a well-designed experiment and I really don't understand the criticism.
Their baseline for comparison is currently day 2, which it is regardless of whether the uninfected samples are present or not. One can't meaningfully use day 0 as a baseline because it can't be distinguished from the lack of treatment. There was likely a good biological reason for this, but as we're naive to that, what I mentioned is the most we can say. This also corresponds to Carlo's answer, where he correctly notes that mock and uninfected can't be compared if we think there's a change simply due to time at day 2.
"On the contrary, if they didn't have the T_0 sample, their mock-infection placebo could have profound time-dependent effects but they'd have no baseline sample against which to compare those effects and wouldn't be able to identify them." This is exactly what we want to know, and we have two genotypes that are known to differ in gene levels even in the uninfected state (without any mock treatment or infection). Considering this, we included the uninfected samples.