I have a .CEL file from a GeneChip Human Genome U133 Plus 2.0 Array, and I would like to calculate expression fold changes for all of the genes represented on it. I typically work with RNA-Seq, where for an experiment like this I would have two samples, A and B, with at least three replicates of each. I would then build a count matrix, use the replicates to estimate dispersion, and generate fold changes between the samples.
In this case, however, I only have a single sample and want to determine the relative expression of each gene. Since I need some benchmark against which to calculate fold changes (something defined as fold change = 1), I am assuming I should use some subset of known housekeeping genes. I've never done this before, though, and would like some advice from those more experienced.
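To make concrete what I mean by a housekeeping-based baseline, here is a minimal Python sketch of the calculation I have in mind. It assumes the intensities have already been background-corrected and summarized to one value per gene, which I have not done yet. The intensity values, the placeholder names `GENE_X` and `GENE_Y`, the function name, and the choice of the median as the baseline are all my own assumptions, not an established method.

```python
import math
from statistics import median


def fold_change_vs_housekeeping(intensity, housekeeping):
    """Return each gene's fold change relative to the median housekeeping signal."""
    # Work in log2 space, as is conventional for microarray intensities.
    log2_values = {gene: math.log2(value) for gene, value in intensity.items()}
    # Baseline = median log2 intensity of the chosen housekeeping genes.
    baseline = median(log2_values[gene] for gene in housekeeping)
    # Convert the log2 difference back to a linear fold change.
    return {gene: 2 ** (lv - baseline) for gene, lv in log2_values.items()}


# Made-up per-gene intensities for illustration only (not real data).
intensity = {
    "GAPDH": 1024.0,
    "ACTB": 4096.0,
    "B2M": 2048.0,
    "GENE_X": 8192.0,  # hypothetical gene of interest
    "GENE_Y": 512.0,   # hypothetical gene of interest
}
housekeeping = ["GAPDH", "ACTB", "B2M"]

fold_change = fold_change_vs_housekeeping(intensity, housekeeping)
for gene, fc in fold_change.items():
    print(f"{gene}\t{fc:.2f}")
```

With this definition, a gene whose intensity equals the housekeeping median gets a fold change of 1, which is the benchmark I am after.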
Can someone please explain what needs to be done to go from a .CEL file to a list of fold change values for all genes in my sample, as determined by comparison to a housekeeping-determined baseline? Ideally, I would like to repeat this on future samples. Since their housekeeping baseline should be comparable with that of this first sample, I could then hopefully make inferences about differential gene expression (DGE) between the samples.
Thanks for any suggestions!
Is there perhaps some reference dataset for the GeneChip Human Genome U133 Plus 2.0 Array? For example, if a very vanilla prep of human cells had been run on this chip and I could access that data, I could compare my sample to that vanilla reference and determine which genes in my sample show fold changes relative to a normal human sample.
So I guess I'm asking whether a very good human control is available for this chip, so that I can use my sample as the experimental condition and calculate DGE against it. Specifically, I'm looking for a normal gastric mucosa tissue reference.