Hi everyone,
I'm trying to do an ANOVA on normalized microarray data. My input file looks like this:
Probe_ID control.CEL sample1.CEL sample2.CEL sample3.CEL
AB112960 2.273088453 2.475903455 2.094961932 2.331040798
AF004856 2.466043833 2.809889315 2.257952048 2.492190838
I have replicates for each group (control, sample1, sample2, sample3). I ran the ANOVA in R using this script:
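# Read the normalized expression table; the first column holds the probe IDs
aovdata <- read.csv("microarray.csv", header = TRUE)
# Columns 2-13 are the 12 arrays (4 groups x 3 replicates)
aovdata1 <- aovdata[, 2:13]
row.names(aovdata1) <- aovdata$Probe_ID
tissue <- gl(4, 3, labels = c("control", "aoi", "ap", "pp"))
# One-way ANOVA for a single probe's expression values
aof <- function(x) {
  anova(aov(x ~ tissue))
}
anovaresults <- apply(aovdata1, 1, aof)
str(anovaresults)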
After running the ANOVA, I'm having trouble extracting the probe IDs along with their p-values. How can I extract just the probe IDs and the corresponding p-values from the ANOVA results? Also, why does summary(anovaresults) show different information than str(anovaresults)?
Thanks a lot!!!
Is there a reason you're not using limma? It is the more standard way to analyze microarray data, and it returns its results as convenient data.frames. Aside from that, anova() returns a table-like object (a data frame with one row per model term). Because that object is a list underneath, apply() cannot simplify the results into a matrix, so it returns a list with one ANOVA table per probe. That is also why summary(anovaresults) only reports the length, class, and mode of each element, while str(anovaresults) shows the full nested structure. Try running the ANOVA on a single probe and inspecting that result with str(). That should give you an idea of how everything is formatted. Personally, I would have the function return only what you need in a meaningfully formatted form (for example, just the p-value) rather than the whole anova object. Then you would have better control over the output.
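To make that concrete, here is a minimal sketch in base R. The expression matrix is simulated (5 made-up probes, 12 arrays) because I don't have your file, and the names expr, get_pvalue, and results are mine, not from your script. It assumes the same layout as your tissue factor: 4 groups with 3 replicates each.

```r
# Simulated stand-in for the real data: 5 probes x 12 arrays (4 groups x 3 replicates)
set.seed(1)
expr <- matrix(rnorm(5 * 12, mean = 2.4, sd = 0.2), nrow = 5,
               dimnames = list(paste0("probe", 1:5), NULL))
tissue <- gl(4, 3, labels = c("control", "aoi", "ap", "pp"))

# Inspect the result for a single probe first
one <- anova(aov(expr[1, ] ~ tissue))
str(one)                    # a data frame with a "Pr(>F)" column
print(one[["Pr(>F)"]][1])   # p-value of the tissue term (row 2 is Residuals)

# Option 1: return only the p-value, so apply() gives a named numeric vector
get_pvalue <- function(x) anova(aov(x ~ tissue))[["Pr(>F)"]][1]
pvals <- apply(expr, 1, get_pvalue)

# Option 2: pull the p-values out of an existing list of anova tables
anovaresults <- apply(expr, 1, function(x) anova(aov(x ~ tissue)))
pvals2 <- sapply(anovaresults, function(a) a[["Pr(>F)"]][1])

# Probe IDs come from the row names and travel along as the vector names
results <- data.frame(Probe_ID = names(pvals), p.value = unname(pvals))
print(results)
```

With your own data, aovdata1 would take the place of expr. Both options should give the same p-values; the first just avoids building the full list of tables.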
limma is definitely designed for that. It also adds some very useful features, such as FDR adjustment, which is relevant here because you are running many tests at once. http://www.statsci.org/smyth/pubs/limma-biocbook-reprint.pdf
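For reference, a rough sketch of what the limma route could look like for a design like this one. The data are simulated (100 made-up probes), the object names are my own, and the group labels are copied from the script in the question. It assumes limma is installed from Bioconductor, and the whole block is skipped if it is not.

```r
if (requireNamespace("limma", quietly = TRUE)) {
  # Simulated stand-in: 100 probes x 12 arrays (4 groups x 3 replicates)
  set.seed(1)
  expr <- matrix(rnorm(100 * 12, mean = 2.4, sd = 0.2), nrow = 100,
                 dimnames = list(paste0("probe", 1:100), paste0("array", 1:12)))
  tissue <- gl(4, 3, labels = c("control", "aoi", "ap", "pp"))

  # Intercept = control; the other three columns are differences from control
  design <- model.matrix(~ tissue)

  # Fit one linear model per probe, then moderate the variances
  fit <- limma::eBayes(limma::lmFit(expr, design))

  # Testing coefficients 2:4 together gives a moderated F-test across groups;
  # adj.P.Val holds the Benjamini-Hochberg (FDR) adjusted p-values
  tab <- limma::topTable(fit, coef = 2:4, number = Inf, adjust.method = "BH")
  print(head(tab[, c("F", "P.Value", "adj.P.Val")]))
}
```

The probe IDs end up as the row names of the table, so the probe ID and p-value pairing comes for free. Note that the moderated F-test is not identical to a per-probe aov(), so the p-values will differ somewhat from the plain ANOVA ones.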