Question

Single cell cluster naming


7 months ago

sc_confusion • 0

It seems like a lot of single-cell papers name clusters based on "canonical markers": they essentially cherry-pick a cluster based on the expression of these markers, many of which are neuropeptides. This is done even for clusters in which only a handful of the thousands of cells express the marker, so that expression across the cluster is sparse to absent. I've even seen papers where a different cluster shows higher expression of one of these markers, but the authors name the cluster with lower expression after the marker. Additionally, many of these clusters often express multiple "markers", not just the one the cluster is named after.

Can someone help me make sense of the logic behind this? Is it basically that other papers have shown the existence of these cells, so they must exist? Even though we don't have any clusters that show high expression of these marker genes, do we just assume that, because the other cells in the cluster share gene expression levels, the cluster should still be given this name? If so, how do we ignore that these clusters often express many of these markers? And why doesn't anyone ever do RNAscope with these markers and some of the top genes that are exclusively expressed in the same cluster, to show that these cells actually exist?

Can someone help me make sense of this? Is anyone aware of any white papers, blog posts, or publications from prominent people in the field that discuss the logic behind this and how to think about cluster naming?

single-cell • 640 views

updated 7 months ago by Ram 45k • written 7 months ago by sc_confusion • 0

score 0 · Answer 1 · 2024-11-27

Until the advent of single-cell sequencing, cell types were defined on the basis of surface proteins, since these could be assessed by staining and CyTOF; these are the original "canonical markers." For relating a cell cluster to prior knowledge, using these markers as the basis for cell type assignment makes sense. However, it assumes that RNA-defined molecular populations correspond with high fidelity to surface-marker-defined populations, in the sense that each surface-marker population is discernible from RNA.

In effect, the typical goal of cluster annotation is not to assign a definitive ontology to a cell type, but to relate clusters to already-established, surface-marker-defined cell types and their corresponding niche and physiology.

So if a cluster uniquely expresses a surface marker (or a known combination of surface markers), even at a low level, it is reasonable to assume that it would stain positively for that cell type. This assumption has some obvious limitations: in practice I rarely find T cell clusters to align nicely with surface markers, and find CITE-seq necessary to fully resolve surface-based types...
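
One rough way to check the "uniquely expresses, even at a low level" idea for a given marker is to compare, per cluster, the fraction of cells with any detected counts alongside the mean count. The sketch below is plain Python on made-up toy data. The function names, the `min_frac` and `max_other_frac` thresholds, and the counts themselves are all assumptions for illustration, not an established annotation method.

```python
from collections import defaultdict


def marker_summary(counts, clusters):
    """Per-cluster detection fraction and mean count for ONE marker gene.

    counts   -- per-cell counts for the marker (toy input, hypothetical)
    clusters -- per-cell cluster labels, same length as counts
    """
    if len(counts) != len(clusters):
        raise ValueError("counts and clusters must have the same length")

    # Group the marker's counts by cluster label.
    per_cluster = defaultdict(list)
    for count, label in zip(counts, clusters):
        per_cluster[label].append(count)

    summary = {}
    for label, vals in per_cluster.items():
        summary[label] = {
            "n_cells": len(vals),
            # Fraction of cells in the cluster with at least one count.
            "frac_expressing": sum(v > 0 for v in vals) / len(vals),
            "mean_count": sum(vals) / len(vals),
        }
    return summary


def is_unique_marker(summary, cluster, min_frac=0.1, max_other_frac=0.05):
    """True if `cluster` detects the marker in at least `min_frac` of its cells
    and every other cluster detects it in at most `max_other_frac`.
    Both thresholds are arbitrary illustrative choices."""
    own = summary[cluster]["frac_expressing"]
    others = [s["frac_expressing"] for label, s in summary.items() if label != cluster]
    return own >= min_frac and all(f <= max_other_frac for f in others)


# Toy data (invented): 12 cells in three clusters of 4 cells each.
clusters = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

# Marker 1: detected in cluster A, but more strongly in cluster C.
shared_marker = [0, 2, 0, 1, 0, 0, 0, 0, 5, 4, 0, 3]
shared_summary = marker_summary(shared_marker, clusters)

# Marker 2: sparse, but detected only in cluster A.
sparse_marker = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
sparse_summary = marker_summary(sparse_marker, clusters)

for name, summ in [("shared", shared_summary), ("sparse", sparse_summary)]:
    for label in sorted(summ):
        print(name, label, summ[label], is_unique_marker(summ, label))
```

In this toy case the sparse marker still singles out cluster A because no other cluster detects it at all, whereas the shared marker does not single out any cluster. That second situation is the one the question describes, and it is where naming a cluster after the marker becomes hard to justify from RNA alone.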