I am helping a colleague automate some AffyMetrix gene chip analysis in R. This is based on the pipeline described by Gillespie et al 2010 ( http://www.biomedcentral.com/1756-0500/3/81 ).
During the analysis, contrasts will be calculated. The experiment can be comparing mutant-type expression with wild-type expression at each time step (m00-w00, m04-w04, m08-w08, ...) or comparing the timecourse of the single gene type, e.g. only wild-type (w00-w04, w04-w08, w08-w12, ...). In some cases we want both things, i.e. three sets in total: m0x-w0x, m0x-m0y, w0x-w0y.
This needs to be generalised as much as possible so that sensible results are obtained (and graphs outputted) with minimal user input.
My colleague has specifically said that he is interested in all contrasts, for example: w00-w04, w04-w08, and w00-w08. Does this make sense? Is it useful? I am not a biologist - my area is mathematics. Including the full range of contrasts causes trouble for the script.
Intuitively it seems that having e.g. w00-w04, w04-08 and w00-w08 is overkill since the third contrast can be derived from summing the first two: linear dependence. Am I right in believing that contrasts should be orthogonal to avoid redundancy?
Any advice on this? Thanks!
By the way, currently I generate the full set of contrasts as follows:
paste2 <- function(...) paste(...,sep="-")
X <- c("m00", "m04", "m08", "m12", "w00", "w04", "w08", "w12")
cons <- outer(X,X,FUN="paste2")
contrasts <- upperTriangle(cons,FALSE)
#mutant on mutant
MMc <- contrasts[grepl("m\\d*-m\\d*",contrasts)]
#mutant on wild
MWc <- contrasts[grepl("m\\d*-w\\d*",contrasts)]
#wild on wild
WWc <- contrasts[grepl("w\\d*-w\\d*",contrasts)]
To generate the reduced set of contrasts I will use:
genContrasts <- function(cons){
tp <- unique(sub("(\\D*)\\d*", "\\1", cons))
mat <- list()
for (i in tp)
{ mat[[i]] <- cons[grep(i, cons)]
}
contrasts <- lapply(mat, function(x)paste(x[-length(x)],
x[-1],
sep="-"))
if(length(mat)>1)
{ contrasts[["crossed"]] <- apply(do.call(cbind,mat), 1,
function(x){paste(x,
collapse="-")})
}
return(contrasts)
}
# Example:
X <- c("m00", "m04", "m08", "m12", "w00", "w04", "w08", "w12")
genContrasts(X) # returns a list of three
X <- c("m00", "m04", "m08", "m12")
genContrasts(X) # returns a list of one