I have just written an R package called clusterCons that implements the method for assessing clustering robustness described in:
S. Monti et al. Consensus clustering: A resampling-based method for class discovery and visualization of gene expression microarray data. Machine Learning, 52, July 2003.
The method is very similar to the re-sampling approach very nicely described by PhiS (above). The package is in alpha at the moment, but quite a few other groups are already using it as part of our functionality testing, and we are just finishing a paper describing the package and its application to cluster and gene prioritisation. The method described by Monti is simple and elegant, and has been cited in >100 papers to date.
clusterCons can use any kind of clustering, provided that the clustering function returns a result that can be formatted as a cluster membership list. It could therefore be used with supervised clustering (all you have to do is write a small, very simple custom wrapper for any new function), but it is currently written around unsupervised methods: 'agnes', 'pam' and 'diana' from the 'cluster' package, plus 'hclust' and 'kmeans' from the base R 'stats' package, all work out of the box. If you are interested in trying it, you can get it through CRAN or SourceForge, and I am very happy to help you on your way if you decide to try it out.
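To make the resampling idea concrete, here is a minimal sketch of a Monti-style consensus matrix. It is written in plain Python purely for illustration and does not use (or mirror) the clusterCons API. The function names, the toy k-means, the 80% subsample fraction and the 50 repeats are all assumptions made for this example. The `cluster_fn` argument stands in for the "wrapper" idea: any function that returns one membership label per item could be plugged in.

```python
import random
from itertools import combinations


def kmeans_labels(points, k, rng, iters=20):
    """Toy k-means on a list of numeric tuples; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centre (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each centre to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(points, k, cluster_fn=kmeans_labels, reps=50, frac=0.8, seed=0):
    """For each pair (i, j): fraction of resamples containing both items
    in which the two items were assigned to the same cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times i and j co-clustered
    sampled = [[0] * n for _ in range(n)]   # times i and j were both drawn
    for _ in range(reps):
        idx = sorted(rng.sample(range(n), max(k, int(frac * n))))
        labels = cluster_fn([points[i] for i in idx], k, rng)
        for (a, i), (b, j) in combinations(enumerate(idx), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[a] == labels[b]:
                together[i][j] += 1
                together[j][i] += 1
    return [
        [1.0 if i == j
         else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def cluster_robustness(cm, members):
    """Mean pairwise consensus among the members of one cluster."""
    pairs = list(combinations(members, 2))
    return sum(cm[i][j] for i, j in pairs) / len(pairs)


# Hypothetical toy data: two well-separated groups of four items each.
data = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.2), (0.3, 0.3),
        (5.0, 5.0), (5.1, 5.1), (5.2, 5.2), (5.3, 5.3)]
cm = consensus_matrix(data, k=2)
print(cluster_robustness(cm, [0, 1, 2, 3]))
```

Consensus values near 1 for pairs inside a cluster and near 0 across clusters indicate a partition that is stable under resampling; intermediate values flag items whose membership depends on which subsample was drawn.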
Unfortunately, although I understand where he is coming from, I actually disagree with Michael and the review by Allison, and I say that as both a biologist and a computational biologist. I think that supervised clustering is often used in the belief that it will provide biologically meaningful clusters, when in fact it produces clusters that are heavily biased by the supervising information. The problem is that the biological supervising information is almost always of very poor, and often unverifiable, quality. It could be, for example, transcription factor binding data (hugely biased, sparse and noisy), ChIP-chip/ChIP-seq (very high noise) or patient classification (unquantified, obtuse, unverifiable or just plain wrong), to name but a few. I'm not saying that supervised clustering doesn't have a place, but I avoid it like the plague and much prefer to follow up unsupervised clustering (with robustness measures) with some proper biological validation. I hope this doesn't come across as antagonistic to Michael; clearly he has a lot of experience with clustering as well. I just wanted to let you know how we handle the problem.
+1 for a very precise and well-formatted question.
... thank you!