Hi all, I'm a bit confused about gene expression phylogenies and the structure of hierarchical clustering. Using the same expression data, I got two different topologies: one from hierarchical clustering (Pearson correlation coefficient, "complete" hclust method, via the pheatmap package) and one from a gene expression phylogeny (neighbour-joining tree based on a pairwise distance matrix of 1 - r, where r is Spearman's correlation coefficient). The details are in the links below. The topology from hierarchical clustering seems more consistent with the species tree. What do you think about that? Why do the two topologies differ? Which one better explains phylogenetic relatedness?
hierarchical clustering: https://ibb.co/kmxDik
gene expression phylogeny: https://ibb.co/cmZn9Q
Any help will be highly appreciated.
Yan
Different methods produce different outputs. There's nothing surprising here.
Complete linkage = furthest neighbour joining
Spearman's correlation != Pearson's correlation
EDIT: To clarify, the two approaches described are both hierarchical clustering using different distance measures and different linkage.
I don't see where there's any phylogeny involved here.
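To illustrate why the choice of correlation alone can change which samples end up next to each other, here is a minimal sketch in plain Python. The three expression vectors are made up for illustration (they are not from the data in the question), and `pearson`, `spearman`, `ranks` and `nearest` are hypothetical helper names. One very highly expressed gene dominates Pearson's r, while Spearman's r only looks at the rank order of the genes.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up expression values for three samples over five genes (assumption).
samples = {
    "A": [1, 2, 3, 4, 100],    # one very highly expressed gene
    "B": [2, 1, 4, 3, 90],     # same outlier gene, low genes reordered
    "C": [10, 20, 30, 40, 50], # same gene order as A, no outlier
}

def nearest(name, corr):
    """Sample with the smallest 1 - r distance to `name`."""
    others = [s for s in samples if s != name]
    return min(others, key=lambda s: 1 - corr(samples[name], samples[s]))

for label, corr in (("Pearson", pearson), ("Spearman", spearman)):
    for other in ("B", "C"):
        d = 1 - corr(samples["A"], samples[other])
        print(f"{label:8s} distance A-{other}: {d:.4f}")
    print(f"{label:8s} nearest to A: {nearest('A', corr)}")
```

With these toy values, Pearson places A next to B and Spearman places A next to C, so the first join already differs before any difference in linkage or tree-building method comes into play.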
Thank you so much, Jean-Karim. I'm sorry, I forgot the species tree: https://ibb.co/k9yxuQ I see that Spearman's and Pearson's correlations are different methods, but I can't understand why the topologies differ so much, since both are correlations after all. PS: I can't find Spearman's correlation in the pheatmap package in R. Which method do you think is more suitable for analysing expression phylogeny?