Hello,
My lab is interested in doing a meta-analysis of two microarray datasets downloaded from a public repository. In each dataset, a microarray analysis was done on tissue collected at two timepoints. We're interested in asking, across both datasets, whether genes change expression abundance between timepoint 1 and timepoint 2.
Originally, I approached this problem with RankProd and found a few genes of interest that were significantly up- or down-regulated. A colleague took a different approach: first, the background-corrected, normalized data were adjusted for batch effects using ComBat. From this ComBat-adjusted expression matrix, ten genes of interest were isolated. She then did a two-way ANOVA with factors for Gene (ten levels) and Timepoint (two levels). With this method, she found a significant Gene x Timepoint interaction. With Tukey's post-hoc test, she found that several genes were significantly up- or down-regulated. Some of these, but not all, were the same genes that had been identified as up- or down-regulated with RankProd.
My concerns are 1) the different numbers of samples in each dataset (Dataset 1 = 3 samples at each timepoint, Dataset 2 = 15 samples at each timepoint), and 2) statistical issues we may be overlooking by running an ANOVA on ComBat output.
Is it okay to apply ANOVA to microarray data in this way? If not, why not?
Thank you for your help!
Thank you for the helpful suggestion! To clarify: when you say RankProd can only handle two groups, do you mean two timepoints, or two datasets? I had used three datasets and two timepoints with RankProd at one point, but this seemed O.K. according to the vignette (3 "origins", 2 "conditions"). I do not think I used ComBat before RankProd in that case.
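To make the "origins" and "conditions" terms concrete, here is a rough sketch of the rank-product idea as I understand it. This is not RankProd's actual implementation: it simply computes a mean log fold change per gene within each origin, ranks the genes within that origin, and takes the geometric mean of the ranks across origins. The function name `rank_product`, the gene names, and all the numbers are made up for illustration.

```python
from math import prod
from statistics import mean

def rank_product(origins):
    """Simplified rank product (hypothetical helper, not the RankProd API).

    origins: list of dicts mapping gene -> (condition1_values, condition2_values),
             with values on a log scale.
    Returns gene -> geometric mean of within-origin ranks, where rank 1 is the
    gene most up-regulated in condition 2 relative to condition 1.
    """
    genes = list(origins[0])
    ranks = {g: [] for g in genes}
    for origin in origins:
        # Fold changes are computed within one origin only, so samples from
        # different origins are never compared directly.
        lfc = {g: mean(origin[g][1]) - mean(origin[g][0]) for g in genes}
        ordered = sorted(genes, key=lambda g: lfc[g], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            ranks[g].append(rank)
    k = len(origins)
    return {g: prod(rs) ** (1.0 / k) for g, rs in ranks.items()}

# Toy data: two origins (datasets), two conditions (timepoints), three genes.
origin_a = {
    "geneA": ([5.0, 5.1], [7.0, 7.2]),
    "geneB": ([6.0, 6.1], [6.0, 6.2]),
    "geneC": ([8.0, 8.1], [6.5, 6.4]),
}
origin_b = {
    "geneA": ([9.0, 9.2], [10.5, 10.6]),
    "geneB": ([10.0, 9.9], [10.1, 10.0]),
    "geneC": ([12.0, 12.1], [11.0, 10.9]),
}

rp = rank_product([origin_a, origin_b])
for gene, score in sorted(rp.items(), key=lambda item: item[1]):
    print(gene, round(score, 2))
```

A small rank product means the gene ranks consistently near the top in every origin. The real package also estimates significance by permutation, which this sketch leaves out.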
Two timepoints. Thankfully, it can handle multiple samples per timepoint :)
Give ComBat a try and then run RankProd on the results. I suspect you'll get results more similar to what your colleague got with the ANOVA. The point here is that you're pretty much guaranteed to have a batch effect when doing a meta-analysis. If you don't correct for it, you end up tanking your statistical power.
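A toy illustration of the power point, in case it helps. The numbers are made up, and the correction here is plain per-dataset mean-centering as a crude stand-in, not ComBat (which also adjusts scale and uses empirical Bayes shrinkage). The helper names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch t-statistic for mean(y) - mean(x)."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se

def center_by_batch(values, batches):
    """Subtract each batch's mean from its samples (crude stand-in, NOT ComBat)."""
    means = {b: mean(v for v, vb in zip(values, batches) if vb == b)
             for b in set(batches)}
    return [v - means[b] for v, b in zip(values, batches)]

# Made-up log-expression for one gene; dataset "B" sits about 3 units higher.
values    = [5.0, 5.2, 4.8, 6.0, 6.2, 5.8, 8.0, 8.2, 7.8, 9.0, 9.2, 8.8]
batches   = ["A"] * 6 + ["B"] * 6
timepoint = [1, 1, 1, 2, 2, 2] * 2

def split(vals):
    """Split samples into (timepoint 1, timepoint 2) lists."""
    t1 = [v for v, t in zip(vals, timepoint) if t == 1]
    t2 = [v for v, t in zip(vals, timepoint) if t == 2]
    return t1, t2

t_raw = welch_t(*split(values))
t_centered = welch_t(*split(center_by_batch(values, batches)))
print(f"t without batch correction:    {t_raw:.2f}")
print(f"t after per-dataset centering: {t_centered:.2f}")
```

The timepoint difference is the same in both cases. What changes is the within-group variance, which the uncorrected dataset offset inflates. Centering like this is only safe because each dataset has the same number of samples at both timepoints; with an unbalanced design it could remove part of the real signal.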