Question

Tool for automatic immune cell annotation in MOUSE?


2 days ago

txema.heredia ▴ 210

Hi,

I am analyzing mouse single-cell data from a tissue with a lot of immune infiltration. Focusing on the immune cells, clustering with Seurat gave me 5 subtypes at a coarse resolution, or 8 subclusters at a finer resolution.

I have already manually annotated them into general immune cell types:

> table(droplevels(md[, c("REclustering_res0.5_ct_immune", "cell_type_manual_res2_cluster")]))
               cell_type_manual_res2_cluster
                Immune - 3  Immune - 10  Immune - 11  Immune - 15  Immune - 16
Macrophages-0            0          355            0            0            0
Macrophages-3            0          266            0            0            0
Macrophages-6            0            0            0            0           64
Macrophages-7           57            1            0            0            0
B cells-2                0            0            0            0          305
T cells-1                0            0            0          319            0
T cells-4                0            0            0          196            0
T cells-5                0            0          163            0            0

This is a very rough analysis and I need to do a better classification and annotation of these.

I've found several automated immune cell annotation tools, such as sc-ImmuCC and ImmunIC, but they seem to work only on human data, not mouse. Similar tools like ImmuCellAI-mouse showed potential, but they only estimate immune cell abundance from bulk data, not from single-cell data.

What are my options to annotate these MOUSE immune cells?

immune single-cell annotation • 256 views

updated 14 hours ago by theHumanBorch ▴ 260 • written 2 days ago by txema.heredia ▴ 210


When annotating, I like to combine canonical markers with automated annotation systems. My go-to tools for mouse immune populations are:

1) SingleR with the ImmGen reference from the celldex Bioconductor package
2) Azimuth with the Pan Sci Mouse data set

These two methods are orthogonal (SingleR uses correlations, while Azimuth maps the single cells onto a reference atlas). If you are looking at bone marrow immune populations, I have found the Haemopedia RNA-seq data to be particularly useful with SingleR. For both the Pan Sci Mouse and Haemopedia data, you will have to convert the Ensembl gene IDs to gene symbols. Here is my code to do it for Haemopedia:

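library(biomaRt)               # useMart(), getBM()
library(dplyr)                 # %>%, group_by(), summarise(), across()
library(SingleCellExperiment)  # SingleCellExperiment()
library(scuttle)               # logNormCounts()

haemopedia.reference <- read.delim("./data/references/Haemopedia-Mouse-RNASeq_raw.txt")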

#Need to Replace Ensembl Gene IDs:
mart <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")  # Change species if needed
gene_map <- getBM(attributes = c("ensembl_gene_id", "external_gene_name"), 
                   filters = "ensembl_gene_id",
                   values = haemopedia.reference$geneId,
                   mart = mart)
haemopedia.reference$geneId <- gene_map$external_gene_name[match(haemopedia.reference$geneId, gene_map$ensembl_gene_id)]

# Remove NA or empty symbols
haemopedia.reference <- haemopedia.reference[!is.na(haemopedia.reference$geneId), , drop = FALSE]
haemopedia.reference <- haemopedia.reference[haemopedia.reference$geneId != "", , drop = FALSE]

# Sum duplicates by gene symbol
summed_matrix <- as.data.frame(haemopedia.reference) %>%
  group_by(geneId) %>%
  summarise(across(everything(), \(x) sum(x, na.rm = TRUE)))  # lambda form; passing na.rm through across() is deprecated in recent dplyr

#Form Matrix
gene.id <- summed_matrix$geneId
summed_matrix <- as.matrix(summed_matrix[,-1])
rownames(summed_matrix) <- gene.id

haemopedia.reference <- SingleCellExperiment( assays = list(counts = summed_matrix))
haemopedia.reference <- logNormCounts(haemopedia.reference)
haemopedia.reference$cell.type <- stringr::str_split(colnames(haemopedia.reference), "[.]", simplify = TRUE)[,1]
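
The "sum duplicates by gene symbol" step can also be done without dplyr. Below is a minimal sketch using base R's `rowsum()` on a made-up toy matrix; the symbols, sample names, and counts are illustrative assumptions, not values from Haemopedia.

```r
# Toy stand-in for an Ensembl-level count table: 4 rows (gene IDs), 2 samples.
# All values here are invented for illustration.
counts <- matrix(
  c(5, 1,
    3, 2,
    7, 0,
    4, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("sampleA", "sampleB"))
)

# Symbol assigned to each row: two IDs map to Cd3e, one ID is unmapped (NA).
symbols <- c("Cd3e", "Cd3e", "Cd19", NA)

# Drop rows without a usable symbol, then sum rows that share a symbol.
keep <- !is.na(symbols) & symbols != ""
collapsed <- rowsum(counts[keep, , drop = FALSE], group = symbols[keep])

print(collapsed)
```

`rowsum()` returns a numeric matrix whose row names are the gene symbols, so with this approach no separate matrix-forming step is needed before building the SingleCellExperiment.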

1 day ago by theHumanBorch ▴ 260


Thanks!

I've already used the SingleR method with some success, but I'm stuck trying Azimuth.

I have downloaded the Pansci mouse rds object, but R crashes during the ScaleData / RunPCA/UMAP / SCTransform steps, before it can be turned into an Azimuth reference. I have subset the original object to keep only immune cells, but even after using Seurat's SketchData to keep only 25k cells, R crashes on my computer with 64 GB of RAM.

How did you generate the Azimuth object?

18 hours ago by txema.heredia ▴ 210


I struggled with memory issues on my laptop too. This is the code that worked for me, which I ran on my workstation:

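library(Seurat)
library(Azimuth)

reference_data <- readRDS("/references/pansci_filtered.rds")
reference_data <- subset(reference_data, Lineage == "Immune")
reference_data <- UpdateSeuratObject(reference_data)
# `homologs` (a table with a Gene.name.mouse column) must already be loaded; that step is not shown here
reference_data <- ConvertGeneNames(reference_data, homologs$Gene.name.mouse, "homologs.rds")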

reference_data <- SCTransform(reference_data, conserve.memory=TRUE, return.only.var.genes = FALSE) # SCTransform to take care of NormalizeData, FindVariableFeatures, and ScaleData
reference_data <- RunPCA(reference_data, npcs = 50) # RunPCA on the SCT residuals
reference_data <- RunUMAP(reference_data, dims=1:50, return.model = TRUE) # Build UMAP model
reference_data <- FindNeighbors(reference_data, dims=1:50) # Find neighbors

# Drop unused factor levels left over after subsetting
reference_data$Sub_cell_type_organ <- droplevels(reference_data$Sub_cell_type_organ)
reference_data$Organ_name <- droplevels(reference_data$Organ_name)
reference_data$Main_cell_type <- droplevels(reference_data$Main_cell_type)
reference_data$Sub_cell_type <- droplevels(reference_data$Sub_cell_type)
reference_data$Lineage <- droplevels(reference_data$Lineage)

# Format object
azimuth.obj <- AzimuthReference(
  object = reference_data,
  refUMAP = "umap",
  refDR = "pca",
  refAssay = "SCT",
  metadata = c("Organ_name", "Main_cell_type", "Sub_cell_type", "Sub_cell_type_organ", "Lineage"),
  dims = 1:50,
  reference.version = "2.0.0"
)

ValidateAzimuthReference(azimuth.obj) #Validate proper reference creation
SaveAzimuthReference(object = azimuth.obj, folder = "/references/pansci_immune/")

14 hours ago by theHumanBorch ▴ 260


Look at the marker genes driving each cluster to see if you can identify specific cell populations. You can also fetch lists of genes specific to each immune population you are looking for from online databases, or create them yourself, and generate a module score for each list in each cell using AddModuleScore from Seurat. Another option would be to find a published dataset with a wide range of annotated mouse immune clusters and transfer its labels to your dataset to annotate your clusters.
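
To illustrate the marker-list scoring idea, here is a minimal base R sketch on a toy matrix. It is a simplified stand-in (mean z-score of each marker set per cell), not the algorithm that AddModuleScore implements, which also subtracts the expression of matched control genes. The marker lists, cell names, and expression values below are assumptions chosen for illustration only.

```r
# Toy log-expression matrix (genes x cells); values are invented.
genes <- c("Cd3e", "Cd3d", "Cd19", "Cd79a", "Adgre1", "Csf1r")
expr <- matrix(0, nrow = length(genes), ncol = 6,
               dimnames = list(genes, paste0("cell", 1:6)))
expr[c("Cd3e", "Cd3d"), 1:2] <- 3      # cells 1-2 look like T cells
expr[c("Cd19", "Cd79a"), 3:4] <- 3     # cells 3-4 look like B cells
expr[c("Adgre1", "Csf1r"), 5:6] <- 3   # cells 5-6 look like macrophages

# Illustrative marker sets; replace with curated lists for real data.
marker_sets <- list(
  T_cells     = c("Cd3e", "Cd3d"),
  B_cells     = c("Cd19", "Cd79a"),
  Macrophages = c("Adgre1", "Csf1r")
)

# Z-score each gene across cells so that genes are comparable.
z <- t(scale(t(expr)))

# Score = mean z-score of the set's genes in each cell (cells x sets matrix).
scores <- sapply(marker_sets, function(g) {
  colMeans(z[intersect(g, rownames(z)), , drop = FALSE])
})

# Highest-scoring set per cell.
best <- colnames(scores)[max.col(scores, ties.method = "first")]
names(best) <- rownames(scores)
print(best)

# With a Seurat object, the equivalent step would be something like
# AddModuleScore(object, features = marker_sets, name = "ImmuneScore").
```

Averaging these per-cell scores within each cluster gives a quick view of which marker set dominates each cluster.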

1 day ago by Bastien Hervé 6.1k