When analysing single-cell RNA-seq data, one of the many steps is clustering, which (put very bluntly) groups similar cells together according to their gene expression patterns. A nice description is found here. On the other hand, pre-compiled reference datasets of known cell signatures exist (e.g. ImmGen) that allow each cell to be labelled directly.
If this is true... what is the purpose of clustering in a single-cell RNA-seq analysis pipeline? I feel it is mainly used to find unknown cell types that are not labelled by, e.g., ImmGen. But even so, when I use ImmGen to label my cells, a lot of information is missing, and some cell types are very difficult to cluster together. So how can we trust the quality of any clustering? What is its actual use?
PS: I couldn't find a satisfactory answer to my (silly?) question. Any link, thought, or info will be helpful.
My two cents: I think distinguishing cells that are unknown, novel, and/or deviating from "normal" (whatever that is) seems to be important.
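To make that a bit more concrete, here is a minimal sketch of how cluster assignments and reference labels could be compared per cluster. It assumes you already have one cluster ID and one reference label per cell; the function name `summarise_clusters`, the `"unassigned"` placeholder for cells the reference could not label, and the toy data are all hypothetical and not tied to any particular package. Clusters with low purity or a high unlabelled fraction would be the ones worth a closer look, either as poorly separated populations or as candidates for something the reference does not cover.

```python
from collections import Counter, defaultdict


def summarise_clusters(cluster_ids, ref_labels, unlabelled="unassigned"):
    """Cross-tabulate per-cell cluster IDs against per-cell reference labels.

    Returns, for each cluster, its size, its most common reference label,
    the fraction of cells carrying that label (purity), and the fraction
    of cells the reference could not label.
    """
    if len(cluster_ids) != len(ref_labels):
        raise ValueError("cluster_ids and ref_labels must have the same length")

    # table[cluster][label] = number of cells with that combination
    table = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, ref_labels):
        table[cluster][label] += 1

    summary = {}
    for cluster, counts in table.items():
        total = sum(counts.values())
        top_label, top_count = counts.most_common(1)[0]
        summary[cluster] = {
            "n_cells": total,
            "top_label": top_label,
            "purity": top_count / total,
            "frac_unlabelled": counts[unlabelled] / total,
        }
    return summary


# Toy data (made up for illustration): 10 cells, 3 clusters.
clusters = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
labels = [
    "T cell", "T cell", "T cell", "B cell",
    "B cell", "B cell", "B cell",
    "unassigned", "unassigned", "NK cell",
]

summary = summarise_clusters(clusters, labels)
for cluster in sorted(summary):
    print(cluster, summary[cluster])
```

In this toy case cluster 2 is mostly unlabelled, which is the kind of group that per-cell labelling alone would leave unexplained but clustering still keeps together for follow-up (e.g. marker gene analysis).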