Question

scRNA seq dissimilarity plot

0

Entering edit mode

9 months ago

carolofharvest ▴ 50

Hi,

I want to make a heat map showing dissimilarities between clusters in single cell data. I retrieved this image from this article : 10.4049/jimmunol.2200365

In the article they describe how they did it like this : To quantify high-level differences in overall gene expression levels between clusters, Analysis of Similarities (ANOSIM) R statistics were computed in the vegan package in R .

But anosim doesnt give a output that can be used for constructing heat map. it only gives p value and R statistics. I couldnt figure it out.

enter image description here

anosim scRNA-seq dissimilarity • 724 views

ADD COMMENT • link updated 8 months ago by Bastien Hervé 6.4k • written 9 months ago by carolofharvest ▴ 50

score 3 · Answer 1 · 2024-10-30

Difficult to know what was the calculation under the hood to plot the heatmap, the methods are not super detailed.

What I would do, is to select a number of gene markers (20,50,100, what you feel make the best distinction between your clusters) for each cluster (with FindAllMarkers in Seurat for example)

Then you create a matrix of normalized expression of gene markers per cluster by aggregating your counts for each clusters.

You scale your matrix by row and column, and you can call correlation between clusters and inverse it to get distances.

gene_markers <- FindAllMarkers(object, slot = "data", min.pct = 0.1, logfc.threshold = 0.5, only.pos = TRUE)
topDiffCluster <- gene_markers %>% filter(p_val_adj < 0.05) %>% filter(avg_log2FC > 0.5) %>% filter(pct.1 > 0.5) %>% arrange(cluster, desc(avg_log2FC)) %>% group_by(cluster) %>% top_n(n = 20, wt = avg_log2FC)
expression_matrix <- as.data.frame(AggregateExpression(object, assays = "RNA", group.by = "seurat_clusters", normalization.method = "LogNormalize", scale.factor = 10000, return.seurat = TRUE)[["RNA"]]$data[unique(topDiffCluster$gene),])
colnames(expression_matrix) <- gsub("g","",colnames(expression_matrix))
expression_matrix_scaled <- t(scale(t(expression_matrix)))
d=as.dist(1-cor(expression_matrix_scaled))
corrplot(d, method = 'shade', order = 'hclust')