I am running edgeR to find differentially expressed genes, and I want to make sure I am constructing the model and contrast correctly.
I have 9 samples in total: 3 experimental samples (ExpA, ExpB, ExpC) and 6 control samples (CtrlA1, CtrlA2, CtrlB1, CtrlB2, CtrlC1, CtrlC2). Samples with the same letter (for example ExpA, CtrlA1, and CtrlA2) were collected on the same day, so the collections occurred on three different days. I need to find the genes that are differentially expressed between the experimental samples and their paired control samples, taking the paired nature of the samples into account.
I have set up a group factor in edgeR that combines Exp/Ctrl and Day:
group <- factor(paste(targets$Experiment, targets$Day, sep = "."))
That yields six levels: Ctrl.A, Ctrl.B, Ctrl.C, Exp.A, Exp.B, Exp.C
I then use this to construct the design:
design <- model.matrix(~0+group)
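For reference, here is a minimal sketch of these two steps on a hypothetical sample sheet. The contents of `targets` (including the `Sample` column) are assumptions based on the description above, not the real data. Note that `model.matrix` prefixes the column names with `group`, so the columns probably need renaming before names such as `Exp.A` can be used in a contrast.

```r
# Hypothetical sample sheet matching the nine samples described above
targets <- data.frame(
  Sample     = c("ExpA", "CtrlA1", "CtrlA2",
                 "ExpB", "CtrlB1", "CtrlB2",
                 "ExpC", "CtrlC1", "CtrlC2"),
  Experiment = c("Exp", "Ctrl", "Ctrl",
                 "Exp", "Ctrl", "Ctrl",
                 "Exp", "Ctrl", "Ctrl"),
  Day        = c("A", "A", "A", "B", "B", "B", "C", "C", "C")
)

# One level per Experiment/Day combination
group <- factor(paste(targets$Experiment, targets$Day, sep = "."))

# No-intercept design: one column per group level
design <- model.matrix(~0 + group)

# Columns come out as "groupCtrl.A", "groupCtrl.B", ...; strip the prefix
# so the names match the factor levels
colnames(design) <- levels(group)
rownames(design) <- targets$Sample

design
```

Each row of `design` has a single 1 in the column of the group that sample belongs to, so each control column covers two samples and each experimental column covers one.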
That yields a design matrix that I use to perform the estimations and fit. Then to set up the differential comparison test, I use the following contrast:
my.contrasts <- makeContrasts(DipvsCtrl = (Exp.A-Ctrl.A)+(Exp.B-Ctrl.B)+(Exp.C-Ctrl.C), levels = design)
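One possible refinement, shown as a sketch rather than a required change: dividing the contrast by 3 makes the reported logFC the average Exp-vs-Ctrl difference across the three days instead of their sum. Rescaling a contrast this way changes the logFC but not the test itself. The sample ordering in `group` below is an assumption, and `makeContrasts` comes from limma, which edgeR depends on.

```r
library(limma)  # provides makeContrasts()

# Assumed group labels for the nine samples (hypothetical ordering)
group <- factor(c("Exp.A", "Ctrl.A", "Ctrl.A",
                  "Exp.B", "Ctrl.B", "Ctrl.B",
                  "Exp.C", "Ctrl.C", "Ctrl.C"))

design <- model.matrix(~0 + group)
colnames(design) <- levels(group)  # drop the "group" prefix

# Same comparison as above, divided by 3 so the logFC is the average
# within-day difference rather than the sum of the three differences
my.contrasts <- makeContrasts(
  DipvsCtrl = ((Exp.A - Ctrl.A) + (Exp.B - Ctrl.B) + (Exp.C - Ctrl.C)) / 3,
  levels = design
)

my.contrasts
```

The resulting column has weight -1/3 on each control group and +1/3 on each experimental group.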
A quick look at the count data suggests this is the correct contrast, but I wanted to check whether anyone can spot something I missed. Did I set up the contrast correctly for the comparison I want to make?
Thanks!
Thanks! So if I understand correctly, this will keep the days in the comparison, right? It is best to compare samples collected on the same day (ExpA vs CtrlA), and I just want to make sure I have that right.
Yes, with this design you are testing "Exp" vs "Ctrl" with the "Day" effect taken into account (think of it as similar in approach to a paired t-test).
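One common way to take a day effect into account, in the spirit of a paired t-test, is an additive model with Day as a blocking factor. The sketch below is an assumption about how that could look, not necessarily the design discussed in this thread. The `targets` data frame is hypothetical.

```r
# Hypothetical sample sheet for the nine samples
targets <- data.frame(
  Experiment = c("Exp", "Ctrl", "Ctrl",
                 "Exp", "Ctrl", "Ctrl",
                 "Exp", "Ctrl", "Ctrl"),
  Day        = c("A", "A", "A", "B", "B", "B", "C", "C", "C")
)

# Make Ctrl the reference level so the coefficient reads Exp relative to Ctrl
Experiment <- factor(targets$Experiment, levels = c("Ctrl", "Exp"))
Day <- factor(targets$Day)

# Additive model: Day absorbs the day-to-day baseline differences,
# and the Experiment term is the Exp-vs-Ctrl effect within days
design <- model.matrix(~Day + Experiment)

colnames(design)
```

With this parameterization the last column, `ExperimentExp`, would be the coefficient to test in edgeR, for example with `glmQLFTest(fit, coef = "ExperimentExp")`, and no explicit contrast matrix is needed.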
Thank you very much! That makes sense to me and I really appreciate the help!