Hi,
I want to cluster gene expression data in R using kmeans (or some other function/package), and I would like the clustering to be 'intelligent', in the sense that some within-cluster dissimilarity metric is minimized while avoiding over-splitting of clusters.
I have already tried kmeans, but I do not want to specify the number of clusters a priori. Here is the code:
data.xpr = read.table("my_data.txt") # Rows = 250 genes, cols = 32 individuals
clusters = kmeans(x = data.xpr, centers=20)
I am aware that there are a few other questions on this subject, but the answers are very broad and none of them lets me do what I would like to accomplish.
I would very much appreciate some code examples in R.
Cheers!
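For concreteness, here is a minimal sketch of one common way to avoid fixing the number of clusters by hand: run kmeans over a range of candidate k and keep the k with the highest average silhouette width. This is only one possible criterion, not necessarily the best one for expression data. The simulated matrix, the search range 2:10, nstart = 25 and the names d, k.candidates, avg.sil and best.k are all assumptions made for illustration; silhouette() comes from the cluster package.

```r
library(cluster)  # recommended package shipped with R; provides silhouette()

set.seed(1)

# Simulated stand-in for the real expression matrix (250 genes x 32 individuals).
# Three groups of genes are shifted in different sets of columns; purely illustrative.
data.xpr <- matrix(rnorm(250 * 32), nrow = 250, ncol = 32)
data.xpr[1:80,    1:10]  <- data.xpr[1:80,    1:10]  + 4
data.xpr[81:160,  11:20] <- data.xpr[81:160,  11:20] + 4
data.xpr[161:250, 21:32] <- data.xpr[161:250, 21:32] + 4

d <- dist(data.xpr)      # Euclidean distances between genes (rows)
k.candidates <- 2:10     # assumed search range for the number of clusters

# Average silhouette width of the kmeans partition for each candidate k
avg.sil <- sapply(k.candidates, function(k) {
  km <- kmeans(data.xpr, centers = k, nstart = 25)
  mean(silhouette(km$cluster, d)[, "sil_width"])
})

# Keep the k with the highest average silhouette width and refit
best.k <- k.candidates[which.max(avg.sil)]
clusters <- kmeans(data.xpr, centers = best.k, nstart = 25)
table(clusters$cluster)  # cluster sizes
```

With real data, data.xpr would come from read.table as above; plotting avg.sil against k.candidates is a quick way to see whether the maximum is clear-cut or nearly flat.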
Will, he was looking for cluster analysis, not for a stochastic process; -1 for a "random-pick" Wikipedia link...