16 months ago
Manuel
After applying hierarchical clustering to a gene mutation matrix with hclust:
hclust_result <- hclust(dist(gene_matrix), method = "ward.D2")
dend <- as.dendrogram(hclust_result)
In the heatmap I observed that the first cluster had alterations in gene1 while the other two did not; likewise, cluster 2 had alterations in gene2 and cluster 3 in gene3.
Given the clusters, how can I find out which features best define each cluster? Is there any function in R that reports feature importance for hclust?
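To make the question concrete, here is a minimal sketch of one way to look for cluster-defining genes. It is an illustration, not a built-in hclust feature. It assumes gene_matrix is a binary samples-by-genes matrix (the toy data below is made up for illustration) and that the tree is cut into k = 3 clusters. It then tabulates the mutation frequency per cluster and runs a one-vs-rest Fisher's exact test for each cluster/gene pair.

```r
# Toy binary mutation matrix (samples x genes); a stand-in for the real gene_matrix.
gene_matrix <- matrix(0, nrow = 30, ncol = 3,
                      dimnames = list(paste0("s", 1:30), paste0("gene", 1:3)))
gene_matrix[1:10,  "gene1"] <- 1
gene_matrix[11:20, "gene2"] <- 1
gene_matrix[21:30, "gene3"] <- 1

hclust_result <- hclust(dist(gene_matrix), method = "ward.D2")

# Assign each sample to a cluster; k = 3 is an assumption taken from the heatmap.
clusters <- cutree(hclust_result, k = 3)

# Fraction of mutated samples for each cluster (rows) and gene (columns).
mut_freq <- apply(gene_matrix, 2, function(g) tapply(g, clusters, mean))

# One-vs-rest Fisher's exact test for every (cluster, gene) pair.
cluster_ids <- sort(unique(clusters))
pvals <- sapply(colnames(gene_matrix), function(g) {
  sapply(cluster_ids, function(cl) {
    tab <- table(factor(clusters == cl, levels = c(FALSE, TRUE)),
                 factor(gene_matrix[, g], levels = c(0, 1)))
    fisher.test(tab)$p.value
  })
})
rownames(pvals) <- cluster_ids

# Benjamini-Hochberg adjustment across all tests.
padj <- matrix(p.adjust(pvals, method = "BH"),
               nrow = nrow(pvals), dimnames = dimnames(pvals))

print(round(mut_freq, 2))
print(signif(padj, 3))
```

Because the clusters were derived from the same matrix being tested, these p-values are circular and are best read as a descriptive ranking of genes per cluster rather than as formal evidence.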
Is "feature importance" a thing or just something you/your lab made up as an internal term? I don't know if automated evaluation of features is possible outside of a Machine Learning context.
It is the same feature importance defined in the machine learning "world", applied in the biological "world".
OK but your post does not mention machine learning at all. Why would one think of machine learning in the context of hierarchical clustering?
I understand your point. Yet hclust is a method of unsupervised learning, which is a field of machine learning. I will add the tag.
I get what you're saying but it sounds like lowering the bar to me - regression is a simple technique that's still called machine learning for some reason but now even hierarchical clustering with zero predictive use is machine learning? What's next to be included in the machine learning umbrella, correlation?
To clarify, I'm not saying clustering should not be included in the umbrella, I'm just wondering where it stops.