Single-Cell RNA Sequencing: Find Tissue with Highest Sox1 Expression
0
0
Entering edit mode
6 weeks ago
Koki • 0

Hello,

I am looking for advice on the general workflow for finding the highest Sox1 tissue in mouse spinal cord across timepoints e9.5 - 11.5 using the single-cell data from the following paper: https://journals.biologists.com/dev/article/146/12/dev173807/19473/Single-cell-transcriptomics-reveals-spatial-and

Here is my approach for a single time point: I have managed to extract e9.5 mouse cells and assign the tissue metadata to the Seurat object as shown below. I've also removed all cells labelled 'NULL' or 'Outliers'. Since there are 20 annotated tissue types, I've clustered the dataset with TSNE using a resolution of 1 such that 20 clusters form. Further graphing shows clusters 4 and 12 express the highest Sox1.

Next, I ran: cluster12.markers <- FindMarkers(se9.5, ident.1 = 12) to identify differentially expressed genes for that cluster.

I have three problems:

  1. The authors have 20 tissue types and I have 20 clusters. How can I determine how closely the tissue annotation and clusters correspond? (The authors have not provided a list of marker genes, so my only option seems to be a literature search)

  2. This type of analysis isn't exactly what I am looking for. Ideally, I want a list showing the relative expression levels of Sox1 for each cluster, ranking from 1-20.

  3. No mitochondrial genes are detected for pre-filtering. All features are in the gene symbol format.

Any advice would be appreciated! Thank you.

e9.5 Analysis:

library(tidyverse)
library(dplyr)
library(Seurat)

se9.5 <- readRDS("se9.5.rds")
pheno <- read_tsv("phenoData_annotated.tsv")

se9.5[["percent.mt"]] <- PercentageFeatureSet(se9.5, pattern = "^Mt-")
VlnPlot(se9.5, features = "percent.mt")

e9.5 <- pheno[pheno$timepoint == 9.5, c("...1", "Type_step1")]
se9.5@meta.data$Step1 <- e9.5$Type_step1

se9.5 <- subset(se9.5, subset = !(Step1 == "NULL" | Step1 == "Outliers"))

se9.5 <- NormalizeData(se9.5, normalization.method = "LogNormalize", scale.factor = 10000)
se9.5 <- FindVariableFeatures(se9.5, selection.method = "vst", nfeatures = 2000)
all.genes <- rownames(se9.5)
se9.5 <- ScaleData(se9.5, features = all.genes)
se9.5 <- RunPCA(se9.5, features = VariableFeatures(object = se9.5))
se9.5 <- FindNeighbors(se9.5, dims = 1:10)

se9.5 <- FindClusters(se9.5, resolution = 1)
se9.5 <- RunUMAP(se9.5, dims = 1:10)
DimPlot(se9.5, reduction = "umap")


VlnPlot(se9.5, features = c("Sox1"))

clusters <- unique(se9.5@meta.data$Step1)

cluster12.markers <- FindMarkers(se9.5, ident.1 = 12)

enter image description here

RNA-Seq • 234 views
ADD COMMENT

Login before adding your answer.

Traffic: 1892 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6