Hi,
I'm analysing a microarray with 59 samples distributed across the following groups and timepoints:
SampleName Group Batch Timepoint
sample1 control_30 1 30
sample2 control_30 2 30
sample3 control_30 2 30
sample4 control_30 3 30
sample5 control_30 4 30
sample6 control_30 4 30
sample7 control_30 4 30
sample8 control_30 4 30
sample9 treat1_30 1 30
sample10 treat1_30 1 30
sample11 treat1_30 2 30
sample12 treat1_30 2 30
sample13 treat2_30 1 30
sample14 treat2_30 1 30
sample15 treat2_30 2 30
sample16 treat2_30 2 30
sample17 treat2_30 3 30
sample18 treat3_30 1 30
sample19 treat3_30 1 30
sample20 treat3_30 2 30
sample21 treat3_30 2 30
sample22 control_60 5 60
sample23 control_60 5 60
sample24 control_60 6 60
sample25 control_60 6 60
sample26 treat1_60 5 60
sample27 treat1_60 5 60
sample28 treat1_60 6 60
sample29 treat1_60 6 60
sample30 treat2_60 5 60
sample31 treat2_60 5 60
sample32 treat2_60 6 60
sample33 treat2_60 6 60
sample34 treat2_60 3 60
sample35 treat3_60 5 60
sample36 treat3_60 5 60
sample37 treat3_60 6 60
sample38 treat3_60 6 60
sample39 control_90 7 90
sample40 control_90 7 90
sample41 control_90 8 90
sample42 control_90 8 90
sample43 treat1_90 3 90
sample44 treat1_90 3 90
sample45 treat1_90 3 90
sample46 treat1_90 3 90
sample47 treat2_90 7 90
sample48 treat2_90 7 90
sample49 treat2_90 8 90
sample50 treat2_90 8 90
sample51 treat3_90 7 90
sample52 treat3_90 7 90
sample53 treat3_90 8 90
sample54 treat3_90 8 90
sample55 treat3_90 3 90
sample56 treat4_90 7 90
sample57 treat4_90 7 90
sample58 treat4_90 8 90
sample59 treat4_90 8 90
We're interested in evaluating the differences between each treatment VS. control within each timepoint, so each treatN
in timepoint N
is compared against the respective control_N
.
My current code and design is:
# Design matrix
batch <- factor(y$targets$Batch)
group <- factor(y$targets$Group)
design <- model.matrix(~0 + batch + group)
colnames(design) <- gsub("group", "", colnames(design))
colnames(design) <- gsub("batch ", "", colnames(design))
contr.matrix <- makeContrasts(
# timepoint 30
Cont.VS.Treat1.30dpi = treat1_30-control_30,
Cont.VS.Treat2.30dpi = treat2_30-control_30,
Cont.VS.Treat3.30dpi = treat3_30-control_30,
# timepoint 60
Cont.VS.Treat1.60dpi = treat1_60-control_60,
Cont.VS.Treat2.60dpi = treat2_60-control_60,
Cont.VS.Treat3.60dpi = treat3_60-control_60,
# timepoint 90
Cont.VS.Treat1.90dpi = treat1_90-control_90,
Cont.VS.Treat2.90dpi = treat2_90-control_90,
Cont.VS.Treat3.90dpi = treat3_90-control_90,
levels = colnames(design))
fit <- lmFit(y, design)
fit2 <- contrasts.fit(fit, contr.matrix)
fit2 <- eBayes(fit2,trend=TRUE,robust=TRUE)
Then, to extract the DEGs between treat1
and control
in timepoint 30 while accounting for batch effects I get the topTable
from the relevant coefficient, as follows:
topTable(fit2, coef = 1, adjust.method = "BH", n=5, sort.by = "P")
While for treat2
and control
in timepoint 30 would be:
topTable(fit2, coef = 2, adjust.method = "BH", n=5, sort.by = "P")
etc.
I would like to assert whether this is the correct way to construct my design matrix and contrasts as I'm getting very different number of DEGs from a previous analysis performed by someone else in our group with the same dataset. Help will be greatly appreciated.
thank you!
Ivàn
Anyone willing to input on this? Thank you! :-)