Difficult to know what was the calculation under the hood to plot the heatmap, the methods are not super detailed.
What I would do, is to select a number of gene markers (20,50,100, what you feel make the best distinction between your clusters) for each cluster (with FindAllMarkers
in Seurat for example)
Then you create a matrix of normalized expression of gene markers per cluster by aggregating your counts for each clusters.
You scale your matrix by row and column, and you can call correlation between clusters and inverse it to get distances.
gene_markers <- FindAllMarkers(object, slot = "data", min.pct = 0.1, logfc.threshold = 0.5, only.pos = TRUE)
topDiffCluster <- gene_markers %>% filter(p_val_adj < 0.05) %>% filter(avg_log2FC > 0.5) %>% filter(pct.1 > 0.5) %>% arrange(cluster, desc(avg_log2FC)) %>% group_by(cluster) %>% top_n(n = 20, wt = avg_log2FC)
expression_matrix <- as.data.frame(AggregateExpression(object, assays = "RNA", group.by = "seurat_clusters", normalization.method = "LogNormalize", scale.factor = 10000, return.seurat = TRUE)[["RNA"]]$data[unique(topDiffCluster$gene),])
colnames(expression_matrix) <- gsub("g","",colnames(expression_matrix))
expression_matrix_scaled <- t(scale(t(expression_matrix)))
d=as.dist(1-cor(expression_matrix_scaled))
corrplot(d, method = 'shade', order = 'hclust')