The dataset has 1000 genes and contains 24 samples with two mouse strains tested (129 and B6) and six brain regions. There are two replicates for each region.
The ANOVA was performed as follows:
sdata<-read.table("http://www.chibi.ubc.ca/wp-content/uploads/2013/02/
sandberg-sampledata.txt", header=T, row.names=1)
strain <- gl(2,12,24, label=c("129","bl6"))
region <- gl(6,2,24, label=c("ag", "cb", "cx", "ec", "hp", "mb"))
# define ANOVA function ## ANOVA
aof <- function(x) {
m<-data.frame(strain,region, x);
anova(aov(x ~ strain + region + strain*region, m))
}
# apply analysis to the data and get the pvalues.
anovaresults <- apply(sdata, 1, aof)
pvalues<-data.frame( lapply(anovaresults, function(x) { x["Pr(>F)"][1:3,] }) )
# Get the genes with good region effect pvalues.
reg.hi.p <-t(data.frame(pvalues[2, pvalues[2,] < 0.0001 & pvalues[3,] > 0.1]))
reg.hi.pdata <- sdata[ row.names(reg.hi.p), ]
A significant p-value resulting from a 1-way ANOVA test would indicate that a gene is differentially expressed in at least one of the groups analyzed. Now that there are more than two groups being analyzed, however, the 1-way ANOVA does not specifically indicate which pair of groups exhibits statistical differences. I know that Post Hoc tests can be applied in this specific situation to determine which specific pair/pairs are differentially expressed in each of the regions ( irrespective of the strains). I would like to know how to apply the Tukey's HSD using R in this case to find out which of these genes ( the ones with good region effect pvalues) are expressed in which region ( for instance in which brain region like "ag","cb","cx and so on).
I tried doing it this way but generates the following error:
mcps <- TukeyHSD(anovaresults, ordered=TRUE)
Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "list"
I noticed that it is a list object but am unsure how to create a anova fit now.
Just edited the answer to make it more clear (and point out these are not "one way ANOVAs")