Hierarchical clustering in R
2.6 years ago

Hi Guys,

Can anyone suggest an efficient way to cluster patients based on their called variants? I performed hierarchical clustering after converting the data into a matrix of 1s and 0s. The problem is that the method places samples closer together whether they match on 1s or on 0s, but I want only matching 1s to count, not matching 0s.

In that case, how can I avoid counting positions where both samples are zero? Or is there another efficient algorithm for this?

Any suggestions are welcome. Thanks in advance.

cluster • 890 views
2.6 years ago
Mensur Dlakic ★ 28k

You can convert your matrix into a sparse format, which stores only the non-zero entries and leaves the zeros implicit. However, I don't think that will change your clustering solution. Matching only on ones and not on zeros is not how clustering happens. The absence of signal (missing zeros) will also become a part of the clustering pattern.
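
A minimal sketch of why a sparse representation alone should not change the result, assuming the distance is a plain count of mismatching positions (Hamming distance). The patient rows and all function names here are made up for illustration; Python is used only to keep the example self-contained.

```python
def hamming_dense(a, b):
    """Count positions where two 0/1 rows differ."""
    return sum(x != y for x, y in zip(a, b))


def to_sparse(row):
    """Keep only the indices of the non-zero entries."""
    return {i for i, v in enumerate(row) if v != 0}


def hamming_sparse(sa, sb):
    """Mismatches are the indices present in exactly one of the two sets."""
    return len(sa ^ sb)


# Hypothetical patients x variants matrix (1 = variant called).
patients = [
    [1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 1],
]

dense = [[hamming_dense(a, b) for b in patients] for a in patients]
sparse_rows = [to_sparse(r) for r in patients]
sparse = [[hamming_sparse(a, b) for b in sparse_rows] for a in sparse_rows]

print(dense == sparse)
```

The two distance matrices are identical, so any hierarchical clustering built on them would be too: the storage format changes, the distance does not.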

I also agree with "The absence of signal (missing zeros) will also become a part of the clustering pattern", but that got me wondering: could we give more weight to 1s than to 0s? If that's possible, couldn't we give the 0s a weight of zero (w=0), essentially removing them from the "clustering pattern"? Of course, this assumes we are able to assign different weights to each class (0 vs. 1), which may be impossible to begin with.
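
One existing measure that behaves roughly like "weight zero for shared 0s" is the Jaccard (asymmetric binary) distance, which only looks at positions where at least one sample has a 1. The sketch below is an illustration with made-up rows and hypothetical function names, not something proposed in this thread; in R, `dist(x, method = "binary")` computes this kind of distance and its output can be passed to `hclust()`.

```python
def jaccard_distance(a, b):
    """1 - (shared 1s) / (positions where at least one row has a 1)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        # Assumption: two rows with no 1s at all are treated as identical.
        return 0.0
    return 1.0 - both / either


def simple_matching_distance(a, b):
    """Fraction of mismatching positions; shared 0s count as agreement."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical rows: a and b share no variants, c and d share three.
a = [1, 0, 0, 0, 0, 0, 0, 0]
b = [0, 1, 0, 0, 0, 0, 0, 0]
c = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 1]

print(simple_matching_distance(a, b), simple_matching_distance(c, d))
print(jaccard_distance(a, b), jaccard_distance(c, d))
```

With simple matching, both pairs look equally close because the shared zeros count as agreement. With Jaccard, the pair that shares no variants is maximally distant, while the pair that shares three variants is much closer.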

Even if adding different weights were possible, I don't see any justification for it.

If the pattern is dominated by zeros, meaning many columns of zeros and few with ones, I would delete all invariant columns and then cluster. Removing invariant columns is a legitimate way to boost the signal, because there is no signal in columns where all values are identical. This would still retain the true signal without any artifact introduced by weighting.
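
A minimal sketch of the column filter described above, assuming the data is a list of 0/1 rows (patients) with one column per variant. The matrix and the function name `drop_invariant_columns` are hypothetical.

```python
def drop_invariant_columns(matrix):
    """Return the matrix without columns whose values are all identical,
    plus the original indices of the columns that were kept."""
    n_cols = len(matrix[0])
    keep = [j for j in range(n_cols) if len({row[j] for row in matrix}) > 1]
    reduced = [[row[j] for j in keep] for row in matrix]
    return reduced, keep


# Hypothetical patients x variants matrix.
matrix = [
    [1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
]

reduced, kept = drop_invariant_columns(matrix)
print(kept)     # indices of the informative columns
print(reduced)  # matrix to pass on to clustering
```

Columns that are all 0 or all 1 add the same amount to every pairwise comparison, so dropping them leaves only the columns that can separate patients.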
