Hi,
I have been having trouble understanding why some dots are so close together in the t-SNE plot but are assigned to different clusters by FindNeighbors() and FindClusters().
For example, in the plot below:
Most of cluster 0 (red dots) is in the bottom right, but some dots are scattered in the upper left and seem closer to cluster 3 (purple) and cluster 2 (blue) than to the rest of cluster 0.
Why were those dots still grouped into cluster 0?
This is a subset of a larger number of cells. The steps I did are:
# read sample1
sample1 <- Read10X("~/sample1/raw_feature_bc_matrix/")
sample1 <- CreateSeuratObject(counts = sample1 , project = "sample1",min.cells = 3, min.features = 200)
sample1@meta.data$cellID <- names(sample1$orig.ident)
# subset cells to only keep the cells that we want
sample1 <- subset(sample1, cellID %in% sample1_ID_list)
sample1 <- NormalizeData(sample1)
sample1 <- FindVariableFeatures(sample1, selection.method = "vst", nfeatures = 2000)
sample1 <- ScaleData(sample1, verbose = FALSE)
sample1 <- RunPCA(sample1, npcs = 30, verbose = FALSE)
# read sample2
sample2 <- Read10X("~/sample2/raw_feature_bc_matrix/")
sample2 <- CreateSeuratObject(counts = sample2, project = "sample2",min.cells = 3, min.features = 200)
sample2@meta.data$cellID <- names(sample2$orig.ident)
# subset cells to only keep the cells that we want
sample2 <- subset(sample2, cellID %in% sample2_ID_list)
sample2 <- NormalizeData(sample2)
sample2 <- FindVariableFeatures(sample2, selection.method = "vst", nfeatures = 2000)
sample2 <- ScaleData(sample2, verbose = FALSE)
sample2 <- RunPCA(sample2, npcs = 30, verbose = FALSE)
# read sample3
sample3 <- Read10X("~/sample3/raw_feature_bc_matrix/")
sample3 <- CreateSeuratObject(counts = sample3, project = "sample3",min.cells = 3, min.features = 200)
sample3@meta.data$cellID <- names(sample3$orig.ident)
# subset cells to only keep the cells that we want
sample3 <- subset(sample3, cellID %in% sample3_ID_list)
sample3 <- NormalizeData(sample3)
sample3 <- FindVariableFeatures(sample3, selection.method = "vst", nfeatures = 2000)
sample3 <- ScaleData(sample3, verbose = FALSE)
sample3 <- RunPCA(sample3, npcs = 30, verbose = FALSE)
# QC step(skipped)
# integrate 3 samples
immune.anchors <- FindIntegrationAnchors(object.list = list(sample1,
sample2,
sample3), dims = 1:20)
combined.3.samples <- IntegrateData(anchorset = immune.anchors,dims = 1:20)
DefaultAssay(combined.3.samples) <- "RNA"
combined.3.samples <- NormalizeData(combined.3.samples, normalization.method = "LogNormalize", scale.factor = 10000)
combined.3.samples <- FindVariableFeatures(combined.3.samples, selection.method = "vst", nfeatures = 2000)
combined.3.samples <- ScaleData(combined.3.samples, verbose = FALSE)
combined.3.samples <- RunPCA(combined.3.samples, npcs = 30, verbose = FALSE)
combined.3.samples <- RunUMAP(combined.3.samples, dims = 1:20,reduction = "pca")
combined.3.samples <- RunTSNE(object = combined.3.samples,reduction = "pca")
combined.3.samples <- FindNeighbors(combined.3.samples, reduction = "pca", dims = 1:20)
combined.3.samples <- FindClusters(combined.3.samples, resolution = 0.5)
# plot
DimPlot(combined.3.samples, reduction = "tsne",group.by = "seurat_clusters",label = TRUE,repel = TRUE)
Thank you!
Leran
Could you please tell us what steps you ran before the t-SNE plot and what your sample is (PBMCs or another tissue)? Could you also tell us whether this is a subset of clusters taken from a larger set of clusters? Sample heterogeneity could be one of the factors driving the scattered clusters you have shown here.
Thanks for the suggestion! I have edited my post!
Leran
Your post has sample1 being read in the sample2 code chunk as well. Is that a copy-paste typo or does your code have that error too?
Thanks for pointing that out! I have corrected it.
Leran
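One thing that may be worth checking against the original question: FindNeighbors() and FindClusters() work on the PCA dimensions passed to them (dims = 1:20 in the posted code), while a t-SNE plot is only a 2-D picture of whichever dimensions it was given, so closeness on the plot and closeness for clustering do not have to agree. The base R sketch below illustrates the idea on made-up data. The matrix `x`, the groups "A" and "B", and all variable names are hypothetical and not taken from the posted samples. Two groups differ only along a third dimension, so neighbours found in the full space respect the groups while neighbours found in the first two dimensions mix them.

```r
# Toy illustration with simulated data (hypothetical, not the poster's cells).
set.seed(1)
n <- 50

# Two groups that overlap in dimensions 1-2 and differ only in dimension 3.
group_a <- cbind(rnorm(n), rnorm(n), rnorm(n, mean = 0))
group_b <- cbind(rnorm(n), rnorm(n), rnorm(n, mean = 10))
x <- rbind(group_a, group_b)
truth <- rep(c("A", "B"), each = n)

# Index of each point's nearest neighbour (excluding itself)
# using only the columns present in `mat`.
nearest_neighbour <- function(mat) {
  d <- as.matrix(dist(mat))
  diag(d) <- Inf
  apply(d, 1, which.min)
}

nn_full <- nearest_neighbour(x)                       # all three dimensions
nn_2d   <- nearest_neighbour(x[, 1:2, drop = FALSE])  # first two dimensions only

# Fraction of points whose nearest neighbour belongs to the same group.
same_group_full <- mean(truth[nn_full] == truth)
same_group_2d   <- mean(truth[nn_2d] == truth)
print(c(full = same_group_full, two_dims = same_group_2d))
```

In the posted pipeline, RunTSNE() is called without a dims argument while FindNeighbors() uses dims = 1:20. As far as I know, RunTSNE() falls back to a smaller default (dims = 1:5 in the Seurat versions I have seen), so passing dims = 1:20 to RunTSNE() as well may make the plot agree better with the clusters. This is a suggestion to test, not a confirmed diagnosis.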