I have RMA (Robust Multi-array Average) scores for the different genes (and their isoforms) on an Affymetrix chip. I want to know which of these genes are "active" (in other words, likely to produce enough protein product to have an effect). I'm not interested in whether they are differentially expressed or X-fold over- or under-expressed. All I want is a classification of each gene as likely "on" or "off".
So far I have log-transformed (base 10) the RMA scores and centered them by subtracting the median. I called all genes with a transformed score < 0 inactive and all genes with a score > 0 active.
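For concreteness, here is a minimal sketch of the procedure described above, using only the Python standard library. The function name `classify_on_off`, the probe IDs, and the dict-based input layout are hypothetical and only for illustration. The post does not say what happens to a centered score of exactly 0, so the sketch assumes such a gene is called "off".

```python
import math
from statistics import median


def classify_on_off(rma_scores):
    """Return a dict mapping each probe ID to True ("on") or False ("off").

    rma_scores: dict of probe ID -> positive RMA score
    (hypothetical input layout, chosen for illustration).
    """
    # Log-transform (base 10), as described in the post.
    logged = {probe: math.log10(score) for probe, score in rma_scores.items()}

    # Center on the median so that the on/off cutoff sits at 0.
    center = median(logged.values())

    # Assumption: a centered score of exactly 0 is called "off";
    # the post only defines the cases < 0 and > 0.
    return {probe: (value - center) > 0 for probe, value in logged.items()}


# Made-up scores, purely to show the call.
example = {"probe_a": 10.0, "probe_b": 100.0, "probe_c": 1000.0, "probe_d": 10000.0}
calls = classify_on_off(example)
```

Note that with the cutoff at the median, roughly half of all probes are called "on" by construction, whatever the data look like.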
Does anyone have a better method?
I think it would help to elaborate on what "produce enough protein to have an effect" means.
Sorry, I was too vague here. I am looking at the effects of a certain set of transcription factors (TFs) in a certain tissue. There seem to be some interesting patterns of co-operation between them. Whether these TFs are able to interact in the first place depends on whether all of them are actually expressed in this tissue. That's what I want to find out with this exercise. Thanks for your help!