Hello,
I am looking for advice on the general workflow for finding the highest Sox1 tissue in mouse spinal cord across timepoints e9.5 - 11.5 using the single-cell data from the following paper: https://journals.biologists.com/dev/article/146/12/dev173807/19473/Single-cell-transcriptomics-reveals-spatial-and
Here is my approach for a single time point: I have managed to extract e9.5 mouse cells and assign the tissue metadata to the Seurat object as shown below. I've also removed all cells labelled 'NULL' or 'Outliers'. Since there are 20 annotated tissue types, I've clustered the dataset with TSNE using a resolution of 1 such that 20 clusters form. Further graphing shows clusters 4 and 12 express the highest Sox1.
Next, I ran: cluster12.markers <- FindMarkers(se9.5, ident.1 = 12) to identify differentially expressed genes for that cluster.
I have three problems:
The authors have 20 tissue types and I have 20 clusters. How can I determine how closely the tissue annotation and clusters correspond? (The authors have not provided a list of marker genes, so my only option seems to be a literature search)
This type of analysis isn't exactly what I am looking for. Ideally, I want a list showing the relative expression levels of Sox1 for each cluster, ranking from 1-20.
No mitochondrial genes are detected for pre-filtering. All features are in the gene symbol format.
Any advice would be appreciated! Thank you.
e9.5 Analysis:
library(tidyverse)
library(dplyr)
library(Seurat)
se9.5 <- readRDS("se9.5.rds")
pheno <- read_tsv("phenoData_annotated.tsv")
se9.5[["percent.mt"]] <- PercentageFeatureSet(se9.5, pattern = "^Mt-")
VlnPlot(se9.5, features = "percent.mt")
e9.5 <- pheno[pheno$timepoint == 9.5, c("...1", "Type_step1")]
se9.5@meta.data$Step1 <- e9.5$Type_step1
se9.5 <- subset(se9.5, subset = !(Step1 == "NULL" | Step1 == "Outliers"))
se9.5 <- NormalizeData(se9.5, normalization.method = "LogNormalize", scale.factor = 10000)
se9.5 <- FindVariableFeatures(se9.5, selection.method = "vst", nfeatures = 2000)
all.genes <- rownames(se9.5)
se9.5 <- ScaleData(se9.5, features = all.genes)
se9.5 <- RunPCA(se9.5, features = VariableFeatures(object = se9.5))
se9.5 <- FindNeighbors(se9.5, dims = 1:10)
se9.5 <- FindClusters(se9.5, resolution = 1)
se9.5 <- RunUMAP(se9.5, dims = 1:10)
DimPlot(se9.5, reduction = "umap")
VlnPlot(se9.5, features = c("Sox1"))
clusters <- unique(se9.5@meta.data$Step1)
cluster12.markers <- FindMarkers(se9.5, ident.1 = 12)