I'm trying to cluster this genetic data, but even with several different methods (k-means, k-medoids, spectral clustering, etc.) I can't seem to resolve the two big clusters. Any suggestions? I would really like to be able to identify the central cluster around y=1. The goal is to run this on multiple datasets.
What are x and y? What kind of data is this? Did you try graph-based clustering on these two dimensions? That is, build a KNN/SNN graph first and then cluster it with igraph (e.g. Louvain)?
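A rough sketch of the idea in Python, in case it helps. Everything here is an assumption on my part: the function name `knn_louvain`, the choice of `k=10`, and the synthetic points are made up for illustration, and I'm using networkx's `louvain_communities` as a stand-in for igraph (in python-igraph the equivalent call would be `Graph.community_multilevel`). It builds a plain KNN graph rather than an SNN graph, and you may need to rescale x and y first if they are on very different scales.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import louvain_communities


def knn_louvain(points, k=10, resolution=1.0, seed=0):
    """Cluster 2-D points with Louvain community detection on a kNN graph.

    Hypothetical helper for illustration; k and resolution need tuning per dataset.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Brute-force pairwise squared distances (fine for a few thousand points).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist2, np.inf)  # a point is not its own neighbour

    # Indices of the k nearest neighbours of every point.
    neighbours = np.argsort(dist2, axis=1)[:, :k]

    # Undirected, unweighted kNN graph: an edge i-j if j is among i's neighbours.
    graph = nx.Graph()
    graph.add_nodes_from(range(n))
    for i in range(n):
        for j in neighbours[i]:
            graph.add_edge(i, int(j))

    # Louvain returns a list of node sets; turn it into one label per point.
    communities = louvain_communities(graph, resolution=resolution, seed=seed)
    labels = np.empty(n, dtype=int)
    for label, members in enumerate(communities):
        labels[list(members)] = label
    return labels


# Synthetic stand-in for the (x, y) values, NOT the real coverage data.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 1.0), scale=0.05, size=(60, 2))
blob_b = rng.normal(loc=(5.0, 1.0), scale=0.05, size=(60, 2))
labels = knn_louvain(np.vstack([blob_a, blob_b]), k=10)
```

Louvain can split a dense group into several communities, so expect to merge labels or lower `resolution` if you get more clusters than you want. A smaller `k` tends to separate nearby groups more aggressively; a larger `k` tends to fuse them.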
I like your suggestions; I will try that. The data is gene coverage at specific coordinates.