Given a microarray data set, multiple probes map to single genes.
Is there an accepted method to reduce these multiple probes to a single representative probe, either by selecting a single probe that meets some criteria or by a data reduction exercise like PCA?
For Affy, at least traditional Affy, Will's answer makes the most sense. But you can in fact have multiple probes for the same gene on other arrays as well. If you just want a quick solution, taking the median is often the best option. If you have more than two probes, the median will get rid of any real outliers, which can originate from problems like the ones David mentioned in his comment. If you are really interested in one specific gene, simply look at the raw data, and even the raw image, and try to make sense of it. Sometimes one of the differing spots is simply in a bad area of the array. Sometimes one of the sequences shows cross-hybridization problems when you BLAST them all. Sometimes there really is nothing you can see and the sequences turn out to overlap (almost) completely; then you are back to the default solution: just take the median.
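As a rough illustration of the median approach, here is a minimal Python sketch. The probe IDs, gene names, expression values, and the function name `collapse_probes_by_median` are all made up for the example; it assumes you already have normalized, log-scale values per probe and a probe-to-gene mapping from your array annotation.

```python
from collections import defaultdict
from statistics import median


def collapse_probes_by_median(expression, probe_to_gene):
    """Collapse probe-level values to one value per gene and sample.

    expression: dict mapping probe ID -> list of values (one per sample)
    probe_to_gene: dict mapping probe ID -> gene identifier
    Probes without a gene annotation are skipped.
    """
    rows_by_gene = defaultdict(list)
    for probe, values in expression.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            rows_by_gene[gene].append(values)

    collapsed = {}
    for gene, rows in rows_by_gene.items():
        # zip(*rows) walks the samples; take the median across probes
        collapsed[gene] = [median(sample_values) for sample_values in zip(*rows)]
    return collapsed


# Hypothetical data: three probes for GENE1, one of which misbehaves.
expression = {
    "probe_a": [7.1, 7.3, 6.9],
    "probe_b": [7.0, 7.4, 7.0],
    "probe_c": [2.5, 12.0, 2.4],  # outlier probe, e.g. a bad spot
    "probe_d": [5.0, 5.2, 5.1],
}
probe_to_gene = {
    "probe_a": "GENE1",
    "probe_b": "GENE1",
    "probe_c": "GENE1",
    "probe_d": "GENE2",
}

result = collapse_probes_by_median(expression, probe_to_gene)
```

With three probes for GENE1, the outlier probe never ends up as the per-sample median. With only two probes the median is just their mean, so an outlier would still pull the value.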
If you're using Affy arrays, I usually run RMA with a custom CDF from the Brain Array group. They re-create the mappings between probes and genes so that each probe-to-gene mapping is one-to-one. They have CDFs for almost every ID system out there (Unigene, Entrez-Gene, RefSeq, Kegg-Genes, etc.).
Particularly on older chips (e.g. Hu133a) this is not the best approach. Some probes have known cross-hybridization issues (check NetAffx or download their databases), and often there will be at least one probeset with noisy background expression.