Cluster annotation in single cell
15 months ago
synat.keam ▴ 100

Dear Fellows,

In single-cell analysis, once we perform clustering, for example with "umap", we get X clusters. The next step is to annotate the clusters, which can be done by looking at the differentially expressed genes (DEGs) within each cluster. If we get DEGs for each cluster, are these DEGs the result of a comparison across multiple cells? I remember that in bulk RNA-seq we can only compare two groups at a time using a contrast. I am not sure how the comparison is done to get DEGs among the several hundred cells in a cluster in a single-cell experiment.

Also, when integrating a large dataset, the main purpose is batch correction, etc. In the end, we get a single UMAP plot, which is the result of integrating all samples and conditions (control/treatment, etc.) from all groups. Does displaying a single UMAP mean that these cell clusters are found across all samples and conditions? How could I know from a single UMAP that one group has fewer, for instance, fibroblasts or T cells, given that I have clusters of fibroblasts, T cells, etc.? What is the point of displaying a single UMAP of the whole dataset (I normally see this in publications)? Sorry, I am just very confused. Looking forward to hearing from you all.

Thanks,

Single-cell • 1.8k views

You need to go through these tutorials, which will help you a lot.

Single-cell best practices

OSCA


I'd really recommend finding a local scRNA-seq expert to talk to at your institution if available. These questions are really beyond the scope of this site and will require lengthy and detailed answers.


Are you using Seurat?


Thanks, I'm using Seurat and have also tried to learn from the Bioconductor book. Could you help explain this to me? I am just very confused and have not made any progress at all.

Regards,


I asked that question because it's relevant to your post. I cannot guide you on such a broad topic. Use the links bk11 has provided you to learn more.

14 months ago
e.r.zakiev ▴ 230

Clustering (for example, in Seurat's pipeline) is usually done on the PCA embedding, not on the UMAP, because the former preserves the Euclidean distances between the cells in the multidimensional expression space, whereas the latter is somewhat stochastic by definition.
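To make the distance point concrete, here is a toy sketch in plain Python (not Seurat's implementation). PCA is a centering followed by a rotation, so keeping all components leaves pairwise Euclidean distances unchanged, and keeping only the top components gives an approximation that can only shrink them. The five 2-D "cells" and the helper names `pca_2d` and `pairwise_distances` are made up for illustration; real data has thousands of genes and typically a few dozen retained components.

```python
import math
from itertools import combinations

def pca_2d(points):
    """Center 2-D points and rotate them onto their principal axes."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix.
    var_x = sum(x * x for x, _ in centered) / (n - 1)
    var_y = sum(y * y for _, y in centered) / (n - 1)
    cov_xy = sum(x * y for x, y in centered) / (n - 1)
    # Closed-form angle of the first principal axis (valid for 2-D only).
    theta = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    c, s = math.cos(theta), math.sin(theta)
    # PC1 is the projection onto (c, s), PC2 onto (-s, c).
    return [(c * x + s * y, -s * x + c * y) for x, y in centered]

def pairwise_distances(points):
    """Euclidean distance for every pair of points."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]

# Hypothetical toy data: five "cells" measured on two "genes".
cells = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8)]
scores = pca_2d(cells)

original = pairwise_distances(cells)
full_pca = pairwise_distances(scores)                       # both PCs kept
pc1_only = pairwise_distances([(pc1,) for pc1, _ in scores])  # PC1 only

max_diff_full = max(abs(a - b) for a, b in zip(original, full_pca))
max_diff_pc1 = max(abs(a - b) for a, b in zip(original, pc1_only))
print("max distance change, all PCs:", max_diff_full)
print("max distance change, PC1 only:", max_diff_pc1)
```

With all components the change is at floating-point noise level; with PC1 only it is small but nonzero. UMAP makes no comparable guarantee about distances, which is one reason it is better kept for visualization than for defining clusters.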

The DEGs can be found with a very nice package called presto; as an added benefit, it doesn't assume any distribution for your data, since it uses nonparametric (i.e. rank-based) statistical testing.
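On the question of how a cluster with hundreds of cells gets compared: as far as I understand it, the usual marker-gene setup is one-vs-rest, where each cell is an observation and each cluster is tested against all remaining cells. Below is a minimal plain-Python sketch of that idea for a single gene, using the rank-sum (Mann-Whitney U) statistic rescaled to an AUC. This is not presto's code; the function names `midranks` and `one_vs_rest_auc`, the expression values, and the cluster labels are all invented for illustration.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def one_vs_rest_auc(expression, labels):
    """For one gene, compare each cluster's cells against all other cells."""
    ranks = midranks(expression)
    results = {}
    for cluster in sorted(set(labels)):
        in_ranks = [r for r, lab in zip(ranks, labels) if lab == cluster]
        n_in = len(in_ranks)
        n_out = len(labels) - n_in
        # Mann-Whitney U from the rank sum of the cells inside the cluster.
        u = sum(in_ranks) - n_in * (n_in + 1) / 2.0
        # AUC: probability that a cell in the cluster outranks a cell outside it.
        results[cluster] = u / (n_in * n_out)
    return results

# Hypothetical expression of one gene in nine cells from three clusters.
expression = [5.0, 6.0, 7.0, 0.0, 1.0, 0.0, 2.0, 1.0, 3.0]
labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

auc = one_vs_rest_auc(expression, labels)
for cluster, value in auc.items():
    print(cluster, round(value, 3))
```

Here cluster A gets an AUC of 1.0 because every A cell expresses the gene more than every other cell, so the gene would be a candidate marker for A. An AUC near 0.5 means no separation, and a value near 0 means the gene is depleted in that cluster. A real tool repeats this for every gene and adds p-values with multiple-testing correction.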
