Question

How To Cluster A Dataset With The Respective Cluster Range?

11.4 years ago

grosy ▴ 100

I have a dataset (a 9x7 matrix) that I need to cluster into any number of clusters (3 is preferable) using k-means clustering, together with the range of each resulting cluster. The dataset looks like the following picture.

Range of the respective cluster means:

Cluster 1 from 0.05 to 0.08 
Cluster 2 from 0.08 to 0.09

NOTE: The numbers are taken from the attached figure. This is purely a bioinformatics problem, because clustering algorithms are useful for filtering large genomic datasets.

I tried k-means clustering in Weka, but was not able to get the range of the respective clusters. The clustering result from Weka is:

 Clustered Instances

0      1 ( 14%)  
1      2 ( 29%)  
2      4 ( 57%)

Could anyone please help me get the cluster range from the clustering algorithm?

If such an algorithm does not exist, is there any option to plot the clusters on a graph after clustering?

clustering r • 5.6k views

What is "the range of the respective clusters"? Unless you can give a formal definition, your question doesn't make much sense. It also seems unrelated to bioinformatics.

11.4 years ago by Michael 56k

So you mean the range is the range of all cluster member vectors combined? Just extract the original vectors for each cluster and compute the range over all of their individual values. In R that is trivial; see ?kmeans for details on how to run k-means in R. But how does this constitute a sensible filtering step for a large genomic dataset? I guess it is better that you tell us what you are really trying to accomplish; then we can tell you if your method is sane.
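
A minimal sketch of that suggestion in base R, under some loud assumptions: the matrix `x` below is random placeholder data (the real 9x7 values are not given in the thread), rows are treated as the items to cluster, and "range" is read as the minimum and maximum over all values of a cluster's member rows. The name `cluster_ranges` is made up for illustration.

```r
# Placeholder for the 9 x 7 matrix from the question (actual values unknown).
set.seed(1)
x <- matrix(runif(9 * 7, min = 0.05, max = 0.10), nrow = 9, ncol = 7)

# k-means on the rows with 3 centers; nstart = 25 tries several random starts.
km <- kmeans(x, centers = 3, nstart = 25)

# For each cluster, pool all values of its member rows and take min and max.
cluster_ranges <- t(sapply(split(seq_len(nrow(x)), km$cluster), function(rows) {
  range(x[rows, , drop = FALSE])
}))
colnames(cluster_ranges) <- c("min", "max")

print(cluster_ranges)
```

Each row of `cluster_ranges` corresponds to one cluster label in `km$cluster`. For a quick visual check, one option is to plot the first two principal components colored by cluster, for example `plot(prcomp(x)$x[, 1:2], col = km$cluster)`. Whether these ranges mean anything for filtering is a separate question.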

11.4 years ago by Michael 56k

Btw, I am not giving a complete answer because I think this method makes no sense.