Hi,
We are trying to answer if the genetic effects of brood and queen on bees' workers are similar or distinct. We have four experimental conditions (treatments): workers in the presence of a queen (CQ), workers in the presence of brood (CB), workers in the presence of both the queen and brood (CBQ), and workers without a queen or brood (C - control). There are six replicates for each treatment. This is how the data looks like:
ID Treatment Control Queen Brood Colony
5B CB 0 0 1 45
7B C 1 0 0 47
8B C 1 0 0 47
9B CB 0 0 1 47
10B CB 0 0 1 47
15B CB 0 0 1 45
16B CQ 0 1 0 46
18B C 1 0 0 45
19B CB 0 0 1 46
22B CQ 0 1 0 47
24B C 1 0 0 45
26B CB 0 0 1 46
27B C 1 0 0 46
30B C 1 0 0 46
33B CBQ 0 1 1 45
37B CQ 0 1 0 45
38B CQ 0 1 0 47
39B CBQ 0 1 1 47
41B CBQ 0 1 1 45
42B CBQ 0 1 1 46
43B CQ 0 1 0 46
45B CBQ 0 1 1 47
48B CBQ 0 1 1 46
49B CQ 0 1 0 45
I first analyzed the data performing a pairwise comparison among the treatments, controlling for the colony identity (using DESeq2). This approach was criticized by the reviewers and the paper was denied for publication. One of the reviewers suggested using the LRT model comparison function of Deseq2 to test the main effects of Queen presence, Brood presence, as well as their interaction.
When I try to use the variables "Control", "Queen" and, "Brood" rather than treatment, I get an error if I include the interaction between Queen and Brood. This is the code I am trying to run:
dds <- DESeqDataSetFromTximport(txi.rsem, colData = doe, design = ~ Colony + Control + Queen + Brood + Queen:Brood)
Error in checkFullRank(modelMatrix) :
the model matrix is not full rank, so the model cannot be fit as specified.
One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.
I have read about this error in the DESeq2 vignette and also a lot of similar posts. But it is still not clear for me if I can't do it with the kind of data that I have, or if there is another way to design the matrix and actually get the results from this interaction. I mean, not using 0 and 1 to represents queen or brood presence/absence, for example.
Next steps would be:
dds <- DESeq(dds, test = "LRT", reduced = ~ Colony + Control) #to get the effect of Queen:Brood interaction
dds <- DESeq(dds, test = "LRT", reduced = ~ Colony + Control + Brood ) #to get the effect of only the queen
dds <- DESeq(dds, test = "LRT", reduced = ~ Colony + Control + Queen) #to get the effect of only the brood
I appreciate any helpful comments. Thank you, Priscila
Thank you very much Carlo. It worked when I removed the Control. Would you mind explaining why you don't think I should use LRT test?
LRT is superior to the Wald test when you want to test multiple terms at once, but I don't think it is the case here. So for me, it would be much more straightforward to test directly the Queen effect, the Brood effect and their interaction with Wald tests. IMHO, coefficients (log2FC) are easier to interpret that way, but maybe this comes to personal preferences at this point.
At the end of the day, if the reviewer wants you to do LRT, you might as well do it. The results should be very similar to the Wald tests anyway.
How to run LRT on single factor for example if I only have different time points six different time point, when I run the reduced model which one becomes my reference in this case , my conditions are designed as
C1,C2,C3,C4,C5 and C6
respectively.My idea is to test each of the time point against each other for example C1 vs The rest , C2 vs the rest . So would be correct to set a reference level ?