Hello,
I've been using edgeR for a while, but I haven't been able to figure out the best way to deal with this kind of design.
I have 8 libraries:
Lib01 - Control, Genotype Sensitive [CS]
Lib02 - Control, Genotype Sensitive [CS] (Lib01 replicate)
Lib03 - Stress, Genotype Sensitive [SS]
Lib04 - Stress, Genotype Sensitive [SS] (Lib03 replicate)
Lib05 - Control, Genotype Tolerant [CT]
Lib06 - Control, Genotype Tolerant [CT] (Lib05 replicate)
Lib07 - Stress, Genotype Tolerant [ST]
Lib08 - Stress, Genotype Tolerant [ST] (Lib07 replicate)
genotype <- factor(c("S","S","S","S","T","T","T","T"))
type <- factor(c("C","C","S","S","C","C","S","S"))
In order to calculate the DE genes in the Sensitive genotype I would do the following:
group <- paste0(type, genotype)
group # "CS" "CS" "SS" "SS" "CT" "CT" "ST" "ST"
design<-model.matrix(~0+group)
design
  groupCS groupCT groupSS groupST
1       1       0       0       0
2       1       0       0       0
3       0       0       1       0
4       0       0       1       0
5       0       1       0       0
6       0       1       0       0
7       0       0       0       1
8       0       0       0       1
fit <- glmFit(y,design)
##Contrasts##
#SSvsCS="SS-CS" Stress vs Control in the Sensitive strain
#STvsCT="ST-CT" Stress vs Control in the Tolerant strain
my.contrasts <- makeContrasts(SSvsCS = groupSS - groupCS, STvsCT = groupST - groupCT, levels = design)
lrt.SSvsCS <- glmLRT(fit, contrast=my.contrasts[,"SSvsCS"])
topTags(lrt.SSvsCS)
However, how is this done to compare DE between genotypes? Do I simply create a contrast SS-ST? Shouldn't I consider genes DE only if they are also DE against the control?
How should the design matrix be prepared to compare genotypes while taking the controls into account?
I'm not sure I follow. There is an example of a 2x2 ANOVA in the limma User's Guide. Could you explain why
model.matrix(~ genotype * type)
does not do what you want, please?
Thanks for your answer.
Your suggestion gives me this design matrix
However that would get me
Which contrast would I use? Intercept-genotypeT-genotypeT?
I'm not sure I agree with Devon re there being no need for contrasts in this design.
The two contrasts that you have mentioned in your post are as follows:
SSvsCS="SS-CS" # Stress vs Control in the Sensitive strain
and
STvsCT="ST-CT" # Stress vs Control in the Tolerant strain
(I'd probably throw in an interaction contrast as well, but we'll do the basics first).
Based on the design matrix you've just posted, the fitted value for
i) CS would be CS.fitted = (Intercept)
ii) SS would be SS.fitted = (Intercept) + typeS
iii) CT would be CT.fitted = (Intercept) + genotypeT
iv) ST would be ST.fitted = (Intercept) + genotypeT + typeS + genotypeT:typeS
where the right-hand sides are constructed from the fitted coefficients for Intercept, typeS, genotypeT and genotypeT:typeS.
So your SSvsCS contrast would test whether SS.fitted - CS.fitted is nonzero, that is, whether the coef for 'typeS' is nonzero.
And your STvsCT contrast would test whether ST.fitted - CT.fitted is nonzero, that is, whether the coef sum 'typeS + genotypeT:typeS' is nonzero.
The interaction term that I mentioned tests whether stress has a different effect in the tolerant strain than in the sensitive strain; it is given by (ST.fitted - CT.fitted) - (SS.fitted - CS.fitted) = (typeS + genotypeT:typeS) - (typeS) = genotypeT:typeS
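The algebra above can be checked in base R. This is only a sketch, assuming the `genotype` and `type` factors from the original post; the names `cell` and `inter` are made up here. The edgeR calls at the end are left as comments because they assume a fitted object (`fit <- glmFit(y, design)`) that is not constructed in this snippet.

```r
# Factors as given in the original post
genotype <- factor(c("S","S","S","S","T","T","T","T"))
type     <- factor(c("C","C","S","S","C","C","S","S"))
design   <- model.matrix(~ genotype * type)

# Keep one design-matrix row per group: each row lists which
# coefficients are summed to give that group's fitted value
group <- paste0(type, genotype)
cell  <- design[!duplicated(group), ]
rownames(cell) <- unique(group)

# Differences between rows are the contrast vectors
SSvsCS <- cell["SS", ] - cell["CS", ]   # picks out typeS only
STvsCT <- cell["ST", ] - cell["CT", ]   # typeS + genotypeT:typeS
inter  <- STvsCT - SSvsCS               # genotypeT:typeS only
print(rbind(SSvsCS, STvsCT, inter))

# With an edgeR fit on this design (assumed, not run here):
# lrt.SSvsCS <- glmLRT(fit, coef = "typeS")
# lrt.STvsCT <- glmLRT(fit, contrast = STvsCT)
# lrt.inter  <- glmLRT(fit, coef = "genotypeT:typeS")
```

Only the STvsCT comparison needs an explicit contrast vector here; the other two correspond to single coefficients of the factorial design.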
I suspect that they really want the factorial design, even if they think they want within group comparisons. Inevitably, people start doing within group comparisons and then do stupid things like comparing lists of DE genes, rather than directly looking at interaction terms in factorial designs. But yeah, if they really have a good reason to do within group comparisons then they should just drop the intercept and use contrasts.
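For what it's worth, the two parameterisations give the same answers, and the interaction can also be written as a contrast in the no-intercept design. Below is a small base-R sketch of that, not anyone's actual analysis: the `toy` values are invented log-expression numbers for a single gene, used only to show that the estimates agree, and `design.means`, `design.fact` and `contrasts.means` are names chosen for this example.

```r
genotype <- factor(c("S","S","S","S","T","T","T","T"))
type     <- factor(c("C","C","S","S","C","C","S","S"))
group    <- factor(paste0(type, genotype))   # levels: CS CT SS ST

# Group-means (no intercept) design, columns renamed to CS, CT, SS, ST
design.means <- model.matrix(~ 0 + group)
colnames(design.means) <- levels(group)

# Contrast vectors in column order CS, CT, SS, ST.
# With limma loaded, the same matrix could be built with
# makeContrasts(SSvsCS = SS - CS, STvsCT = ST - CT,
#               Interaction = (ST - CT) - (SS - CS), levels = design.means)
contrasts.means <- cbind(
  SSvsCS      = c(-1,  0,  1, 0),
  STvsCT      = c( 0, -1,  0, 1),
  Interaction = c( 1, -1, -1, 1)
)
rownames(contrasts.means) <- levels(group)

# Invented toy response for one gene (hypothetical values)
toy <- c(5.1, 4.9, 7.0, 7.2, 5.0, 5.2, 5.9, 6.1)

# Estimates from the group-means design plus contrasts
beta.means <- lm.fit(design.means, toy)$coefficients
est.means  <- drop(t(contrasts.means) %*% beta.means)

# Estimates from the factorial design
design.fact <- model.matrix(~ genotype * type)
beta.fact   <- lm.fit(design.fact, toy)$coefficients
est.fact <- c(
  SSvsCS      = unname(beta.fact["typeS"]),
  STvsCT      = unname(beta.fact["typeS"] + beta.fact["genotypeT:typeS"]),
  Interaction = unname(beta.fact["genotypeT:typeS"])
)

print(rbind(est.means, est.fact))   # the two rows should match
```

The point of the comparison is that the choice between the two designs is one of convenience: the interaction is tested directly in either, so there is no need to compare lists of DE genes.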
Hi Devon,
I want to use a two-factor model (two genotypes and two treatment levels) for an analysis with DESeq2. I want to see the interaction term and the main effects separately, that is, to check whether gene expression is affected by treatment only, by genotype only, or by treatment in a genotype-dependent manner. Could you suggest how I could do that? It looks like I cannot see the p-values for the interaction or the main effects separately, as in ANOVA. In that case, how would we know whether the interaction has a significant effect or whether we need to consider only the main effects?
Thanks
You don't need a contrast with that design; the design itself directly answers the question.