How to cluster microbial samples after principal coordinate analysis based on beta-diversity
5 months ago

Hi!

I am working on the analysis of microbial data (relative abundances) coming from a large cohort. I have a matrix of Bray-Curtis distances between our samples, and I've run PCoA on the distance matrix. Our samples seem to fall into two main clusters. Interestingly, this clustering doesn't seem to be driven by any metadata feature we are aware of. Our plot looks like this: [image: PCoA plot of the samples]

I would like to define the two clusters using k-means (or some other clustering method; k-means just seems to be popular for this kind of analysis), draw the cluster borders on the plot, and get a list of the samples belonging to each cluster. Currently I feel a bit stuck at this step. Is there an R or Python package for such analysis that would make my work easier?

Thanks in advance!

clustering bray-curtis microbiome pcoa • 391 views
5 months ago
Mensur Dlakic ★ 28k

The scikit-learn package in Python has many clustering methods that would work out of the box on your data:

https://scikit-learn.org/stable/modules/clustering.html
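As a rough sketch of what that could look like for the k-means case: the snippet below builds a toy Bray-Curtis matrix, runs a small hand-written classical PCoA, and clusters the first two axes with `KMeans`. The toy abundances, the sample IDs, and the `pcoa` helper are all made up for illustration. With real data you would start from your own distance matrix or PCoA coordinates, and clustering on only two axes is an assumption, not a requirement.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def pcoa(dist, n_axes=2):
    """Classical PCoA of a square distance matrix (hypothetical helper)."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Gower-centred matrix of squared distances.
    gower = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1][:n_axes]
    # Bray-Curtis is not Euclidean, so clip any negative eigenvalues at zero.
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))


# Toy relative abundances (assumption): two groups with different dominant taxa.
rng = np.random.default_rng(0)
group_a = rng.dirichlet([20, 20, 1, 1, 1, 1], size=30)
group_b = rng.dirichlet([1, 1, 1, 1, 20, 20], size=30)
abundances = np.vstack([group_a, group_b])
sample_ids = np.array([f"S{i:02d}" for i in range(len(abundances))])

# Bray-Curtis distances, then the first two PCoA axes.
dist = squareform(pdist(abundances, metric="braycurtis"))
coords = pcoa(dist, n_axes=2)

# k-means with a fixed number of clusters.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(coords)

# List of samples per cluster, plus a silhouette score as a sanity check.
clusters = {int(k): sample_ids[labels == k].tolist() for k in np.unique(labels)}
score = silhouette_score(coords, labels)

for k, members in clusters.items():
    print(f"cluster {k}: {len(members)} samples")
print(f"silhouette: {score:.2f}")
```

The `labels` array lines up with the rows of `coords`, so it can be passed straight to a scatter plot as the colour variable, and `clusters` gives the per-cluster sample lists.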

In addition to k-means, where you specify the number of clusters yourself, there are methods that let the algorithm decide on the optimal number of clusters. I recommend these two:
