Hello,
I am analysing a dataset that looks at the effects of mating and diet on Drosophila. My design is a 2x2 design where I have:
1) Virgin on Protein
2) Virgin on Carbohydrate
3) Mated on Protein
4) Mated on Carbohydrate
I first looked at which genes respond to diet in virgin and in mated flies. The contrast I used was protein vs. carbohydrate (going from the carbohydrate to the protein environment), run separately for virgin and for mated flies.
- Virgin DE genes = 101
- Mated DE genes = 864
As we can see there are way more genes that respond in the mated flies than in the virgin flies.
I want to generate a model that integrates all 4 treatments (~ diet + mating + diet:mating). I am worried about the difference in magnitude of the responses to diet, with mated flies having roughly 8x more genes that respond to diet. Does this mean that I'm going to get a lot of false positives?
Would it be better to stick with individual comparisons, or is it OK to generate overall models?
Any advice would be greatly appreciated.
The interaction term in your model formula (diet:mating) should be sufficient to account for mating status affecting the response of gene expression to diet.
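
To make that concrete, here is a minimal sketch of what the coefficients of ~ diet + mating + diet:mating represent for a single gene. It assumes treatment coding with virgin and carbohydrate as the reference levels, and it uses made-up log2 expression values. It works on plain group means rather than the count model your DE tool actually fits, so treat it as an illustration of the parameterisation only.

```python
from statistics import mean

# Hypothetical log2 expression values for ONE gene, three replicates per group.
# These numbers are invented for illustration; they are not from the dataset.
log_expr = {
    ("virgin", "carb"):    [5.0, 5.2, 4.8],
    ("virgin", "protein"): [5.5, 5.4, 5.6],
    ("mated",  "carb"):    [5.1, 4.9, 5.0],
    ("mated",  "protein"): [7.0, 7.2, 6.8],
}

group_mean = {group: mean(values) for group, values in log_expr.items()}

# Coefficients of ~ diet + mating + diet:mating under treatment coding,
# ASSUMING reference levels mating = virgin and diet = carb.
intercept = group_mean[("virgin", "carb")]
diet_effect_in_virgin = group_mean[("virgin", "protein")] - intercept   # "diet" term
mating_effect_on_carb = group_mean[("mated", "carb")] - intercept       # "mating" term

# The diet effect in mated flies is the diet term plus the interaction term.
diet_effect_in_mated = (
    group_mean[("mated", "protein")] - group_mean[("mated", "carb")]
)
interaction = diet_effect_in_mated - diet_effect_in_virgin              # "diet:mating" term


def fitted(mating, diet):
    """Fitted group mean from the design-matrix row (1, is_protein, is_mated, product)."""
    is_protein = 1 if diet == "protein" else 0
    is_mated = 1 if mating == "mated" else 0
    return (
        intercept
        + diet_effect_in_virgin * is_protein
        + mating_effect_on_carb * is_mated
        + interaction * is_protein * is_mated
    )


# With four coefficients and four groups, the model reproduces every group mean.
for group, observed in group_mean.items():
    assert abs(fitted(*group) - observed) < 1e-9

print(f"diet effect in virgin flies: {diet_effect_in_virgin:.2f}")
print(f"diet effect in mated flies:  {diet_effect_in_mated:.2f}")
print(f"interaction (difference):    {interaction:.2f}")
```

Because the full model has one coefficient per group, it can represent a small diet response in virgin flies and a large one in mated flies at the same time. Note that, under the reference levels assumed here, the diet coefficient on its own is the diet effect in virgin flies only; the diet effect in mated flies is the diet coefficient plus the interaction coefficient.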