Is There Any Cluster Validation Package In R That Supports Overlapping Clusters?
13.5 years ago
Sirus ▴ 820

Hello everybody, I am working on a clustering method that generates overlapping clusters, and I want to calculate some cluster validation measures for my results and for the results of other existing algorithms. I've tried some functions such as cluster.stats, but unfortunately it doesn't support overlapping clusters. Could anyone point me to another package that does? If there isn't one, is there a way to work around it (other than writing the code myself :) )? Thank you in advance.

clustering r • 6.2k views
13.5 years ago
Michael 55k

Most cluster statistics are made for hard cluster assignments and cannot easily be transferred to fuzzy or model-based clustering; think of the Rand index, for example. Thus, other measures are needed, unless you turn the soft clustering into a hard one by assigning each item to a single cluster using a cut-off. The packages that contain fuzzy clustering methods will possibly contain tailored methods to evaluate cluster quality. I know, e.g., that the mclust package (function Mclust) computes the BIC. The cluster package contains the fanny fuzzy clustering method; that package also contains some (not so widely applied) evaluation methods. So look into the packages containing soft clustering methods, or resort to hard clustering.
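To make the cut-off idea concrete, here is a minimal sketch using fanny from the cluster package (which ships with R). The toy data, the choice of k = 2 and the 0.6 threshold are all assumptions picked for illustration, not recommendations:

```r
library(cluster)  # recommended package, installed with R

set.seed(1)
# Toy data (made up): two Gaussian blobs in 2-D
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2))

fit <- fanny(x, k = 2)

# Soft assignment: one row per observation, one column per cluster
head(fit$membership)

# Dunn's partition coefficient and its normalized version
fit$coeff

# Harden with a cut-off: keep the best cluster only if its
# membership reaches the (arbitrary) threshold, otherwise NA
cutoff <- 0.6
best   <- apply(fit$membership, 1, which.max)
hard   <- ifelse(apply(fit$membership, 1, max) >= cutoff, best, NA)
table(hard, useNA = "ifany")

# Any hard-clustering index can then be applied to the confidently
# assigned observations, e.g. the average silhouette width
keep <- !is.na(hard)
sil  <- silhouette(hard[keep], dist(x[keep, ]))
summary(sil)$avg.width
```

Note that observations below the cut-off are simply dropped here, so the resulting index describes only the unambiguous part of the data; whether that is acceptable depends on how much overlap you expect.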


Thank you, Michael, for the help. I have checked these clustering packages and none of them seems to support soft clustering, so the only way is to write my own script.


Look at the 'model-based' section. Mixture-model-based clustering is a 'soft clustering' approach. Another example is the 'fanny' (fuzzy clustering) method in the cluster package. But it is true that there are very few cluster indices that take soft cluster assignments into account, and it is hard to compare different methods based on these statistics.
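As an illustration of what a soft-assignment index looks like, here is a rough base-R sketch of two classical ones, the partition coefficient and the partition entropy. The function names are hypothetical helpers, not from any package, and they assume an n x k membership matrix whose rows sum to 1 (such as the membership component of a fanny fit or the z component of an Mclust fit), which may not hold for every overlapping-cluster method:

```r
# u: n x k membership matrix, each row sums to 1 (assumption)

# Partition coefficient: 1 for a crisp partition, 1/k when every
# observation belongs equally to all k clusters
partition_coefficient <- function(u) sum(u^2) / nrow(u)

# Partition entropy: 0 for a crisp partition, log(k) at maximal fuzziness
partition_entropy <- function(u) {
  p <- u[u > 0]                 # skip zeros, since 0 * log(0) is taken as 0
  -sum(p * log(p)) / nrow(u)
}

# Two made-up extremes with n = 4 observations and k = 2 clusters
u_crisp <- diag(2)[c(1, 1, 2, 2), ]          # each row is (1,0) or (0,1)
u_fuzzy <- matrix(0.5, nrow = 4, ncol = 2)   # every row is (0.5, 0.5)

partition_coefficient(u_crisp)  # 1
partition_coefficient(u_fuzzy)  # 0.5
partition_entropy(u_crisp)      # 0
partition_entropy(u_fuzzy)      # log(2)
```

Both indices only describe how crisp the memberships are and ignore the geometry of the data, which is part of why comparing different methods on such statistics is hard.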


Another option is the Mfuzz package, available from Bioconductor. It performs fuzzy clustering and lets you check the validity of the fuzzification parameter and of the number of clusters chosen. It also provides overlap.plot, which plots the first two principal components and shows the relationship between clusters. Info here.
