resolving specific clusters from data (spectral clustering)
2.4 years ago

I'm trying to cluster this genetic data, but even with several different methods (k-means, k-medoids, spectral clustering, etc.) I can't seem to resolve the two big clusters. Any suggestions? I would really like to be able to identify the central cluster around y=1. The goal is to run this on multiple datasets.

[Image: scatter plot of the data]

clustering • 791 views

What are x and y? What kind of data is this? Did you try graph-based clustering on these two dimensions, i.e. building a KNN/SNN graph first and then clustering it with igraph (e.g. Louvain)?
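A rough sketch of that idea in Python, in case it helps. Note the assumptions: it uses scikit-learn's `kneighbors_graph` and networkx's `louvain_communities` rather than igraph, the helper name `knn_louvain_labels` and the `n_neighbors=10` default are made up for illustration, and the points are synthetic stand-ins for the real [x, y] data.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import louvain_communities
from sklearn.neighbors import kneighbors_graph


def knn_louvain_labels(points, n_neighbors=10, seed=0):
    """Cluster points by running Louvain on a symmetrized KNN graph.

    Hypothetical helper for illustration; n_neighbors is a tuning knob.
    """
    # Sparse 0/1 matrix: entry (i, j) is 1 if j is among i's nearest neighbors.
    adjacency = kneighbors_graph(
        points, n_neighbors=n_neighbors, mode="connectivity", include_self=False
    )
    # KNN relations are not symmetric, so keep an edge if either direction has it.
    adjacency = adjacency.maximum(adjacency.T)

    rows, cols = adjacency.nonzero()
    graph = nx.Graph()
    graph.add_nodes_from(range(points.shape[0]))
    graph.add_edges_from(zip(rows.tolist(), cols.tolist()))

    # Louvain returns a list of node sets, one per community.
    communities = louvain_communities(graph, seed=seed)
    labels = np.empty(points.shape[0], dtype=int)
    for label, members in enumerate(communities):
        labels[list(members)] = label
    return labels


# Synthetic stand-in data: two well-separated blobs (replace with real [x, y]).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
blob_b = rng.normal(loc=[10.0, 10.0], scale=0.5, size=(100, 2))
points = np.vstack([blob_a, blob_b])

labels = knn_louvain_labels(points)
print("communities found:", len(set(labels.tolist())))
```

Louvain may split a dense blob into several communities, so the number of neighbors (and the `resolution` argument of `louvain_communities`) usually needs some tuning per dataset.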


I like your suggestions; I will try that. The data are gene coverage at specific coordinates.

2.4 years ago
Mensur Dlakic ★ 28k

I think Gaussian mixture models will work well on this type of scatter, though you will likely end up with more than two clusters. If you provide [X, Y] coordinates for data points, I could tell you for sure.

You could literally plug your data into the script below in place of the random points it generates:

https://scikit-learn.org/stable/auto_examples/mixture/plot_gmm.html
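For reference, a minimal sketch of the same approach, separate from the linked script. The assumptions here are mine: the points are synthetic (a tight band near y=1 plus a broader cloud above it, standing in for the real [X, Y] data), the helper `fit_gmm_by_bic` is a hypothetical name, and choosing the number of components by lowest BIC is one common option rather than something the linked example prescribes.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_gmm_by_bic(points, max_components=6, seed=0):
    """Fit GMMs with 1..max_components components and keep the lowest-BIC one."""
    best_model, best_bic = None, np.inf
    for k in range(1, max_components + 1):
        model = GaussianMixture(
            n_components=k, covariance_type="full", n_init=3, random_state=seed
        )
        model.fit(points)
        bic = model.bic(points)  # lower BIC = better fit/complexity trade-off
        if bic < best_bic:
            best_model, best_bic = model, bic
    return best_model


# Synthetic stand-in data (replace with an (n, 2) array of real coordinates).
rng = np.random.default_rng(0)
central = rng.normal(loc=[5.0, 1.0], scale=[1.5, 0.05], size=(300, 2))
broad = rng.normal(loc=[5.0, 3.0], scale=[2.0, 0.6], size=(300, 2))
points = np.vstack([central, broad])

gmm = fit_gmm_by_bic(points)
labels = gmm.predict(points)

print("components selected:", gmm.n_components)
print("component means:")
print(gmm.means_)
```

The component whose mean has y closest to 1 would be the candidate for the central cluster. Full covariance matrices let each component stretch along x, which k-means cannot do.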
