I have a dataset (a 9x7 matrix) that I need to group into some number of clusters (3 is preferable) using k-means clustering, and I also need the range of values for each resulting cluster. The dataset looks like the following picture.
Range of the respective cluster means:
Cluster 1 from 0.05 to 0.08
Cluster 2 from 0.08 to 0.09
NOTE: The numbers are taken from the attached figure. This is purely a bioinformatics problem, because clustering algorithms are useful for filtering large genomic datasets.
I tried k-means clustering in Weka, but I was not able to get the range of the respective clusters. The clustering result from Weka is:
Clustered Instances
0 1 ( 14%)
1 2 ( 29%)
2 4 ( 57%)
Could anyone please help me get the range of each cluster from the clustering algorithm?
If no such algorithm exists, is there any option to plot the clusters on a graph after clustering?
What is "the range of the respective clusters"? Unless you can give a formal definition, your question doesn't make much sense. It also seems unrelated to bioinformatics.
So you mean the range is the range of all cluster member vectors combined? Just extract the original vectors for each cluster and compute the range over all of their single values. In R that is trivial; see ?kmeans for details on how to do k-means in R. But how does this constitute a sensible filtering step for a large genomic dataset? I guess it is better that you tell us what you are really trying to accomplish; then we can tell you if your method is sane. By the way, I am not giving a complete answer because I think this method makes no sense.
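To make the suggestion above concrete, here is a minimal sketch of "cluster, then take the min and max over all values of each cluster's member vectors". It is written in plain Python rather than R so that it has no dependencies. The function names (sqdist, farthest_first, kmeans, cluster_ranges) are hypothetical, the deterministic farthest-first initialization is my own assumption to keep the result reproducible, and the numbers are made-up toy values, not the data from the figure.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(rows, k):
    """Deterministic initialization (an assumption, not part of k-means itself):
    start from the first row, then repeatedly add the row farthest from
    all centers chosen so far."""
    centers = [list(rows[0])]
    while len(centers) < k:
        nxt = max(rows, key=lambda r: min(sqdist(r, c) for c in centers))
        centers.append(list(nxt))
    return centers

def kmeans(rows, k, n_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    centers = farthest_first(rows, k)
    labels = None
    for _ in range(n_iter):
        # Assign every row to its nearest center.
        new_labels = [min(range(k), key=lambda c: sqdist(row, centers[c]))
                      for row in rows]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Move each center to the mean of its current members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_ranges(rows, labels):
    """(min, max) over all single values of the member vectors of each cluster."""
    ranges = {}
    for c in sorted(set(labels)):
        values = [x for row, lab in zip(rows, labels) if lab == c for x in row]
        ranges[c] = (min(values), max(values))
    return ranges

# Hypothetical toy data: 7 instances with 3 attributes each.
rows = [
    [0.05, 0.06, 0.07],
    [0.06, 0.08, 0.05],
    [0.50, 0.52, 0.55],
    [0.51, 0.49, 0.53],
    [0.90, 0.92, 0.95],
    [0.91, 0.93, 0.90],
    [0.94, 0.96, 0.92],
]

labels = kmeans(rows, 3)
for cluster, (lo, hi) in cluster_ranges(rows, labels).items():
    size = labels.count(cluster)
    print(f"Cluster {cluster}: {size} instances, range {lo} to {hi}")
```

Note that the cluster numbers are arbitrary labels, and that the ranges of different clusters can overlap on real data, since k-means separates whole vectors rather than individual values.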
Thanks, I have got the idea of doing the clustering in R. Thanks for the above comment.
The idea is to cluster the classes and then check the range...