Pearson correlation or Euclidean distance for clustering?
4.2 years ago
mrashad ▴ 80

I have a multi-omics expression matrix that I need to cluster using hierarchical clustering and k-means, but I am unsure which distance measure to use: Euclidean distance or Pearson correlation.

Is there any guidance on which of the two should be used for expression data?

gene-expression • 8.0k views
4.2 years ago

There is neither a guide nor standard for this.

If using either Euclidean distance or Pearson correlation, your data should follow a Gaussian / normal (parametric) distribution. So, if the data come from a microarray, anything from RMA normalisation is fine. If they come from RNA-seq, any transformed, normalised count metric should be fine, such as variance-stabilised, regularised log, or log CPM expression levels.
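To illustrate the kind of transformation meant here, below is a minimal sketch of a log2 counts-per-million calculation in plain Python. The function name `log2_cpm`, the `pseudocount` parameter, and the example counts are all made up for illustration. This is not the exact formula used by any particular package, and it is not a substitute for a variance-stabilising or regularised log transformation.

```python
import math

def log2_cpm(counts, pseudocount=1.0):
    """Return log2(CPM + pseudocount) for one sample's raw gene counts.

    Illustrative only: dedicated RNA-seq packages use more careful
    normalisation (e.g. scaling factors) than a plain library-size division.
    """
    library_size = sum(counts)
    return [math.log2(c / library_size * 1e6 + pseudocount) for c in counts]

# Hypothetical raw counts for three genes in two samples.
# sample_b was sequenced twice as deeply as sample_a.
sample_a = [10, 90, 900]
sample_b = [20, 180, 1800]

print(log2_cpm(sample_a))
print(log2_cpm(sample_b))
```

After the transformation the two samples have the same values, because the difference in sequencing depth is divided out, and the log compresses the long right tail of the counts.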

If you are performing clustering on non-normal data, like 'normalised' [non-transformed] RNA-seq counts, FPKM expression units, etc., then use Spearman correlation (non-parametric).
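As a rough sketch of why the rank-based measure behaves differently on skewed data: Spearman correlation is Pearson correlation computed on ranks, so a single very large count does not dominate it. The helper names and the count values below are hypothetical and written in plain Python for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical untransformed counts for five genes in two samples.
sample_1 = [5, 20, 80, 300, 12000]
sample_2 = [9, 15, 110, 250, 900]

print("Pearson: ", pearson(sample_1, sample_2))
print("Spearman:", spearman(sample_1, sample_2))
# A dissimilarity for clustering can then be taken as 1 - correlation.
print("Spearman distance:", 1 - spearman(sample_1, sample_2))
```

Here the two samples order the genes identically, so the Spearman correlation is 1, while the Pearson value is pulled below 1 by the non-linear relationship between the raw counts.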

As usual, get intimate with your data, know its distribution, and thereafter choose the appropriate method(s).

Kevin


The importance of the data distribution when choosing a distance measure for clustering is a good point to raise. Thanks.

This is my understanding of how Euclidean distance and Pearson correlation distance differ when applied to gene expression clustering:

When we are interested in the overall shape of the expression profiles (going up and down together), correlation-based measures such as Pearson correlation would be the choice. In other cases, we may want to cluster together observations with the same magnitude of dysregulation, so that observations with high feature values end up in the same cluster. In these cases, Euclidean distance would be our choice for calculating the dissimilarity matrix.
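A small sketch of that difference, assuming three made-up log-scale expression profiles (the gene names and values are hypothetical, and 1 minus the Pearson correlation is used as the correlation distance):

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance; sensitive to differences in magnitude."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_distance(x, y):
    """1 - r: 0 for identical shape, 2 for perfectly opposite shape."""
    return 1.0 - pearson_correlation(x, y)

# Hypothetical profiles across five samples.
gene_a = [1.0, 2.0, 3.0, 2.0, 1.0]  # low expression, up then down
gene_b = [6.0, 7.0, 8.0, 7.0, 6.0]  # same shape as gene_a, much higher level
gene_c = [3.0, 2.0, 1.0, 2.0, 3.0]  # similar level to gene_a, opposite shape

for name, other in [("gene_b", gene_b), ("gene_c", gene_c)]:
    print(
        f"gene_a vs {name}: "
        f"euclidean={euclidean_distance(gene_a, other):.2f}, "
        f"correlation distance={correlation_distance(gene_a, other):.2f}"
    )
```

With correlation distance, gene_a sits next to gene_b (same shape), whereas with Euclidean distance it sits closer to gene_c (similar magnitude). In practice one would normally compute the full dissimilarity matrix with a library, for example `scipy.spatial.distance.pdist` with `metric="correlation"` or `metric="euclidean"`, and pass it to a hierarchical clustering routine.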


Thanks a lot, I got it.


I got it, thanks a lot for this fruitful answer.
