I am using limma to test for differential expression. And I am trying to find the best model to use. My variable of interest is condition. I have several batch variables plate, well and gender which seems to be highly associated with top three principal components in a PCA.
I have found this function in limma, selectModel()
which should help me to pick the best fitting model.
d1 <- model.matrix(~0+condition,data=md1)
d2 <- model.matrix(~0+condition+gender,data=md1)
d3 <- model.matrix(~0+condition+plate,data=md1)
d4 <- model.matrix(~0+condition+well,data=md1)
d5 <- model.matrix(~0+condition+plate+well,data=md1)
d6 <- model.matrix(~0+condition+gender+plate,data=md1)
d7 <- model.matrix(~0+condition+gender+well,data=md1)
d8 <- model.matrix(~0+condition+gender+plate+well,data=md1)
dlist <- list(d1,d2,d3,d4,d5,d6,d7,d8)
sm1 <- selectModel(dfr,dlist,criterion="aic")
sm2 <- selectModel(dfr,dlist,criterion="bic")
Here are barplots summarising the best models as picked by selectModel()
using AIC and BIC criterion respectively.
barplot(table(sm1$pref))
barplot(table(sm2$pref))
X axis shows the model number and y axis shows the number of probes for which a model was picked as the best. Both the AIC and BIC metrics seem to be differing in opinion. What would you say is the best model here? or am I doing this wrong?