Question about data reduction for Gower distance and unsupervised clustering
2.9 years ago

Hi everyone!

I have a data set on which I would like to do unsupervised clustering. The data are of mixed types, so I chose Gower distance to build a dissimilarity matrix. However, I have some concerns about the data.

Part of the data describes the localization of a specific measurement: the physicians recorded it at 6 different sites, each with a "yes"/"no" outcome. I am wondering whether I should collapse these into one column representing every possible combination, or keep them as separate asymmetric binary variables in the Gower distance matrix. I would then run PAM (Partitioning Around Medoids) on the distance matrix to look for potential clusters.

I have tried merging the binaries into a new factor variable, e.g. Test, Test2 and Test3 into New_variable:

    Test   Test2  Test3  New_variable
 A  0      1      1      011
 B  1      1      1      111
 C  0      0      0      000
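
A minimal sketch of that merge, written in Python purely for illustration. The names `rows` and `merge_pattern` are made up for this example, and the values are just the toy rows from the table:

```python
# Toy rows from the table above: one 0/1 flag per measurement site.
rows = {
    "A": (0, 1, 1),
    "B": (1, 1, 1),
    "C": (0, 0, 0),
}

def merge_pattern(flags):
    """Concatenate the 0/1 flags into one label such as '011'."""
    return "".join(str(int(f)) for f in flags)

# One categorical value per row, to be treated as a factor level.
new_variable = {name: merge_pattern(flags) for name, flags in rows.items()}
print(new_variable)

# With six yes/no sites, the merged factor can have up to 2**6 levels.
n_sites = 6
max_levels = 2 ** n_sites
print(max_levels)
```

With all 6 sites merged this way, the factor can take up to 64 distinct levels, so many levels may end up with very few observations.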

So the question is whether to run the analysis on each variable separately or to merge them into a single factor. My guess is that the two approaches would answer different questions?
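
To make the difference concrete, here is a rough sketch that computes the binary part of a Gower dissimilarity by hand for the three toy rows, once with the sites kept as separate binaries (symmetric and asymmetric) and once with the merged factor. It is plain Python with hypothetical helper names (`gower_binary`, `gower_merged`, `all_pairs`), not the output of any clustering package, and it assumes that two all-"no" rows count as identical in the asymmetric case:

```python
from itertools import combinations

# Toy rows from the table above: three yes/no sites per row.
patients = {
    "A": (0, 1, 1),
    "B": (1, 1, 1),
    "C": (0, 0, 0),
}

def gower_binary(x, y, asymmetric):
    """Gower dissimilarity restricted to binary variables."""
    mismatches, weight = 0, 0
    for a, b in zip(x, y):
        if asymmetric and a == 0 and b == 0:
            continue  # asymmetric binary: a joint "no" is ignored
        mismatches += int(a != b)
        weight += 1
    # Assumption: if every variable was ignored, call the rows identical.
    return mismatches / weight if weight else 0.0

def gower_merged(x, y):
    """One nominal variable: 0 if the whole pattern matches, else 1."""
    return 0.0 if x == y else 1.0

def all_pairs(dist):
    """Dissimilarity for every unordered pair of rows."""
    names = sorted(patients)
    return {(i, j): dist(patients[i], patients[j])
            for i, j in combinations(names, 2)}

symmetric = all_pairs(lambda x, y: gower_binary(x, y, asymmetric=False))
asymmetric = all_pairs(lambda x, y: gower_binary(x, y, asymmetric=True))
merged = all_pairs(gower_merged)

for pair in symmetric:
    print(pair, round(symmetric[pair], 3), round(asymmetric[pair], 3), merged[pair])
```

In this toy case the separate binaries keep partial similarity (A and B differ at only one of three sites, giving 1/3), whereas the merged factor puts every pair of distinct patterns at distance 1. Note also that, alongside other variables, the merged factor would count as one variable rather than six. In R, `cluster::daisy()` with `metric = "gower"` accepts a `type = list(asymm = ...)` argument to mark columns as asymmetric binaries.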

Cheers and thanks in advance!
