Question

How are FindNeighbors() and FindClusters() related, and how do they work?

22 months ago

leranwangcs ▴ 150

Hi,

I'm trying to figure out how each step of Seurat works, but I don't have a deep math background, so I have trouble understanding what exactly FindNeighbors() does and what FindClusters() does. Can anyone provide a simplified explanation, please? For example, after FindNeighbors(), what scores are produced, and how are they used by FindClusters()?

Thanks so much! Leran

seurat clustering • 4.3k views

ADD COMMENT • link updated 22 months ago by bk11 ★ 3.1k • written 22 months ago by leranwangcs ▴ 150

FindNeighbors() and FindClusters() are two Seurat functions that together perform unsupervised, graph-based clustering of single-cell data.

FindNeighbors(): FindNeighbors() builds the graph that the clustering step operates on. For each cell it first finds the k.param nearest neighbors (20 by default) in a reduced-dimensional space, typically the selected principal components, using Euclidean distance by default. It then constructs a shared nearest neighbor (SNN) graph by calculating the neighborhood overlap (Jaccard index) between every cell and its k.param nearest neighbors. Two cells whose neighbor sets largely coincide get an edge weight close to 1, cells with little overlap get a weight close to 0, and very weak edges are pruned (see the prune.SNN argument). So the "scores" produced by this step are edge weights between pairs of cells, not a score per cell.
FindClusters(): FindClusters() takes the SNN graph and partitions the cells into clusters. It uses a graph-based approach, by default the Louvain algorithm, which groups cells so as to maximize modularity: edges inside a cluster should be dense and strong compared with edges between clusters. This is unsupervised, so the cells are grouped without any predefined labels. You do not set the number of clusters directly; it is controlled by the resolution parameter (default 0.8), where higher values tend to give more, smaller clusters. The algorithm argument switches between Louvain variants, SLM and Leiden. Methods such as K-means, hierarchical clustering and DBSCAN are other families of clustering algorithms and are not what FindClusters() uses.
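To make the neighborhood-overlap idea behind FindNeighbors() concrete, here is a minimal sketch in plain Python. It is only an illustration of the concept, not Seurat's implementation: the function names `knn_sets` and `snn_graph` are made up for this example, the six 2-D points stand in for cells in PCA space, each cell is assumed to count as its own neighbor, and the pruning threshold is simply passed in as `prune`.

```python
import math


def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest points.

    Euclidean distance is used, and a point counts as its own nearest
    neighbour (distance 0). This is an assumption of the sketch.
    """
    neighbours = []
    for p in points:
        order = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(order[:k]))
    return neighbours


def snn_graph(points, k, prune=1 / 15):
    """Build a toy shared-nearest-neighbour graph.

    The edge weight between cell i and one of its neighbours j is the
    Jaccard index of their neighbour sets; edges at or below `prune`
    are dropped.
    """
    nn = knn_sets(points, k)
    edges = {}
    for i in range(len(points)):
        for j in nn[i]:
            if i == j:
                continue
            jaccard = len(nn[i] & nn[j]) / len(nn[i] | nn[j])
            if jaccard > prune:
                edges[(min(i, j), max(i, j))] = jaccard
    return edges


# Two well-separated groups of three "cells" each (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
edges = snn_graph(points, k=3)
for pair, weight in sorted(edges.items()):
    print(pair, round(weight, 2))
```

In this toy data, cells in the same group share all of their neighbors, so they are linked by strong edges, while no edge connects the two groups. That weighted graph is the kind of object the clustering step then works on.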

Relationship and Working:

FindNeighbors() and FindClusters() are run one after the other as the clustering step of a single-cell analysis.
Graph-based clustering needs to know which cells are close to each other; FindNeighbors() provides this in the form of the nearest-neighbor (KNN/SNN) graph.
FindClusters() does not go back to the expression data. It works on the graph that FindNeighbors() stored in the Seurat object, so FindNeighbors() has to be run first, and changing dims or k.param there changes the clusters you get.
The neighbor graph can also help when judging the clusters produced by FindClusters(): cells with few or weak connections to the rest of their cluster may be outliers or poorly assigned.
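The way a clustering step can use the graph's edge weights is easier to see with a small calculation. The sketch below scores two candidate partitions of a toy weighted graph with a commonly used form of modularity that includes a resolution parameter. It is an assumption-laden illustration, not Seurat's code: the function name `modularity`, the toy edge weights and the exact formula are chosen for this example, and a real Louvain run searches over many partitions instead of comparing just two.

```python
from collections import defaultdict


def modularity(edges, labels, resolution=1.0):
    """Score a partition of a weighted, undirected graph.

    edges:  dict mapping (i, j) node pairs to edge weights
    labels: dict mapping each node to its cluster label
    For each cluster: (weight inside the cluster / total weight)
    minus resolution * (cluster degree / (2 * total weight)) ** 2.
    """
    total = sum(edges.values())
    inside = defaultdict(float)   # edge weight fully inside each cluster
    degree = defaultdict(float)   # summed weighted degree of each cluster
    for (i, j), w in edges.items():
        degree[labels[i]] += w
        degree[labels[j]] += w
        if labels[i] == labels[j]:
            inside[labels[i]] += w
    return sum(
        inside[c] / total - resolution * (degree[c] / (2 * total)) ** 2
        for c in degree
    )


# Toy graph: two tight triangles joined by one weak edge (hypothetical weights).
toy_edges = {
    (0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,
    (3, 4): 1.0, (3, 5): 1.0, (4, 5): 1.0,
    (2, 3): 0.1,
}
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}

q_split = modularity(toy_edges, two_clusters)
q_merged = modularity(toy_edges, one_cluster)
print("two clusters:", round(q_split, 3))
print("one cluster: ", round(q_merged, 3))
```

Splitting the graph along the weak edge scores higher than lumping every cell together, which is the kind of partition a modularity-based method prefers. In this formula a larger resolution increases the penalty on big clusters, which is one way to see why higher resolution values tend to produce more clusters.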

In summary, FindNeighbors() turns the cells into a graph whose edge weights measure how much two cells' neighborhoods overlap, and FindClusters() partitions that graph into groups of densely connected cells. The first step defines similarity between cells; the second uses it to assign each cell a cluster label.

ADD REPLY • link 22 months ago by bk11 ★ 3.1k