A beginner's question (from a wet-bench-trained person) about microarray gene expression analysis
I downloaded and extracted the "expression values" (or should I call them "signal intensities"?) of some microarray experiments from GEO. Since I will only be looking at a handful of genes, NOT whole-genome expression, is there free software or a free web-based interface that lets me plug in the "raw values" and tells me whether they are statistically different or similar? Or, if no easy interface exists to expedite my analysis, what would be the "correct" way of analyzing these raw values?
Example:
probe      HGNC  ctrl1    ctrl2    ctrl3    ctrl4    ss1      ss2      ss3      ss4
232546_at  TP73  4.76775  3.36975  4.89104  2.96765  4.15112  3.07037  4.52393  4.71372
From what I've read, it seems that I cannot DIRECTLY compare these two sets of values using a simple average (or median) and a t-test; it seems that I need to normalize the values first. Do I need to? Which control probes should I look for to normalize the gene values? Any direction or solution will be greatly appreciated. My apologies if a similar question has been raised in the past (apparently, I am so new to array analysis that I do not know how to ask or find a question).
You cannot look at only "a handful" of genes. Differential expression analysis begins with everything on the array, then tells you what (if anything) is significant.
If I understand the general premise of differential expression analysis correctly, when the expression values of the genes of interest are too low, they could be lumped together, statistically speaking, as "no differential expression", especially if there are other genes on the array that are GROSSLY differentially expressed. In addition, I am under the impression that even if several genes in the same biological signaling/metabolic pathway are each only marginally differentially expressed, collectively they may still change the OUTPUT of that pathway significantly. Is this correct?
Point 1: no. Point 2: yes.
You'll see, when you play with GEO2R, that the starting point for the calculations is a matrix of probes (rows) vs. samples (columns). You don't discard or select anything before starting; just work through the process with all probes and see what comes out the other end.
And yes, there are additive effects in cells which, frankly, we don't understand or model very well right now.
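To make the "matrix of probes (rows) vs. samples (columns)" idea concrete, here is a rough Python sketch of testing every row and then correcting the p-values across all of them. This is not what GEO2R does internally: GEO2R runs limma in R, which uses moderated t-statistics, and an exact permutation test is swapped in here only to keep the example dependency-free. Only the 232546_at row comes from the question; probe_A and probe_B are invented values, and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Probes-by-samples matrix. Only 232546_at is from the question;
# probe_A and probe_B are invented rows for illustration.
groups = ["ctrl"] * 4 + ["ss"] * 4
matrix = {
    "232546_at": [4.76775, 3.36975, 4.89104, 2.96765,
                  4.15112, 3.07037, 4.52393, 4.71372],
    "probe_A": [5.0, 5.2, 4.9, 5.1, 8.0, 8.3, 7.9, 8.1],  # invented
    "probe_B": [6.1, 5.9, 6.0, 6.2, 6.0, 6.1, 5.9, 6.2],  # invented
}


def permutation_p(values, groups):
    """Exact two-sided permutation p-value for the difference in group means."""
    n = len(values)
    ss_idx = [i for i, g in enumerate(groups) if g == "ss"]

    def abs_diff(chosen):
        chosen = set(chosen)
        ss = [values[i] for i in range(n) if i in chosen]
        ctrl = [values[i] for i in range(n) if i not in chosen]
        return abs(mean(ss) - mean(ctrl))

    observed = abs_diff(ss_idx)
    # Enumerate every way of relabelling the samples into the two groups.
    diffs = [abs_diff(c) for c in combinations(range(n), len(ss_idx))]
    return sum(d >= observed - 1e-12 for d in diffs) / len(diffs)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


probes = list(matrix)
pvals = [permutation_p(matrix[p], groups) for p in probes]
adj = benjamini_hochberg(pvals)
for probe, raw, a in zip(probes, pvals, adj):
    print(f"{probe}\traw p = {raw:.3f}\tadjusted p = {a:.3f}")
```

Note that with 4 vs. 4 samples there are only 70 ways to split the samples, so the smallest p-value a permutation test can give is 2/70 (about 0.029). Once that is corrected over tens of thousands of probes, nothing would pass, which is one reason real array tools rely on moderated statistics rather than a plain per-gene test.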