My lab is interested in analyzing some old microarray data. I ran RMA (with quantile normalization) on the raw .CEL files using the affy package in R.
The goal is to make a bar plot/heatmap/etc of a few genes showing differences in gene expression from timepoint 1 to timepoint 2, and to report the fold change in gene expression.
Since we want to report fold changes, my PI would like to add a further normalization step on top of RMA to account for differences in library size/RNA input between chips. He proposed normalizing by GAPDH (i.e. treating GAPDH as a 'loading control' for the microarray), but I think that will be problematic, since GAPDH expression can itself vary between samples.
Can anyone suggest alternatives? Is it kosher to divide each gene's expression value by the total of all expression values, when each value represents a fluorescence intensity rather than a count of gene fragments? Or should we divide all genes by the geometric mean of several purported housekeeping genes?
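To make the last option concrete, here is a minimal sketch of what I have in mind, written in plain Python just to show the arithmetic (my actual analysis is in R). It assumes the values are on the log2 scale, which is what affy's rma() returns, so "dividing by the geometric mean" of the reference genes becomes subtracting their mean log2 value. The chip labels, the choice of ACTB and HPRT1 as extra reference genes, GENE_X, and all the numbers are made up for illustration.

```python
# Hypothetical log2-scale expression values (as RMA would report them), one dict per chip.
# All gene names and numbers here are invented for illustration only.
chips = {
    "t1": {"GAPDH": 12.0, "ACTB": 13.0, "HPRT1": 9.0, "GENE_X": 8.0},
    "t2": {"GAPDH": 12.5, "ACTB": 13.5, "HPRT1": 9.5, "GENE_X": 10.5},
}

# Assumed set of "housekeeping" reference genes (an assumption, not a recommendation).
HOUSEKEEPING = ["GAPDH", "ACTB", "HPRT1"]


def housekeeping_offset(log2_values, reference_genes):
    """Mean of the log2 values, i.e. log2 of the geometric mean of the linear intensities."""
    return sum(log2_values[g] for g in reference_genes) / len(reference_genes)


def normalize(log2_values, reference_genes):
    """Subtract the reference offset from every gene on this chip (log2 scale)."""
    offset = housekeeping_offset(log2_values, reference_genes)
    return {gene: value - offset for gene, value in log2_values.items()}


normalized = {chip: normalize(values, HOUSEKEEPING) for chip, values in chips.items()}

# On the log2 scale a fold change is a difference; exponentiate to get the ratio.
log2_fc = normalized["t2"]["GENE_X"] - normalized["t1"]["GENE_X"]
fold_change = 2 ** log2_fc

# The same comparison without the extra normalization step, for reference.
raw_log2_fc = chips["t2"]["GENE_X"] - chips["t1"]["GENE_X"]

print(f"log2 FC after housekeeping normalization: {log2_fc:.2f}")
print(f"fold change after housekeeping normalization: {fold_change:.2f}")
print(f"log2 FC from RMA values alone: {raw_log2_fc:.2f}")
```

In this toy example the reference genes are all 0.5 log2 units higher on the second chip, so the extra step shrinks the log2 fold change of GENE_X from 2.5 to about 2.0. Whether that shift is a technical artifact or real biology is exactly what I am unsure about.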
Thank you for your help!