integrate single cell RNA sequencing data
2
0
Entering edit mode
18 months ago
Andy ▴ 120

Good morning,

In my project, the majority of the datasets use the Illumina HiSeq platform to process the data. However, I have come across one dataset that uses the NOVA sequencing platform, and I would like to include it. However, I am concerned about potential integration issues. I was wondering if you could give me some advice?

Best,
Andy

scRNAseq single-cell • 1.3k views
ADD COMMENT
1
Entering edit mode
18 months ago
John Ma ▴ 310

You need to perform batch effect removal for these. Seurat has its integration protocol, and if you use scverse on Python, the usual choice will be the SCVI model of scvi-tools.

ADD COMMENT
1
Entering edit mode
18 months ago

There are different approaches that you may choose to deal with batch correction. Harmony is one of those approaches that has been shown to be effective in batch effect removal. In practice, merge the raw Seurat objects for all samples, then perform subsequent normalization, variable feature selection, and PC calculation on the merged object. Here is an example code snippet using the harmony package to demonstrate this process.

library(Seurat)
library(harmony)
library(dplyr)

# Merge raw objects
merged_obj <- merge(x = obj1,
                    y = c(obj2, obj3, ...),
                    merge.data = TRUE)

# Data processing
merged_obj <- merged_obj %>%
  NormalizeData() %>%
  FindVariableFeatures(selection.method = "vst", nfeatures = 2000, assay = "SCT") %>% 
  ScaleData() %>%
  SCTransform(vars.to.regress = c("mitoRatio"))  # Add any other variables you want to regress out for their effect

# Perform PCA
merged_obj <- RunPCA(merged_obj, assay = "SCT", npcs = 50)

# Run Harmony for batch correction
harmonized_obj <- RunHarmony(merged_obj, 
                             group.by.vars = c("instrument_type", ...),  # Replace with actual column names in your metadata
                             reduction = "pca", assay.use = "SCT", reduction.save = "harmony")

# Generate UMAP using the harmony-generated embedding
harmonized_obj <- RunUMAP(harmonized_obj, reduction = "harmony", assay = "SCT", dims = 1:40)  # Adjust the number of dimensions if needed

# Perform clustering using the harmonized object
harmonized_obj <- FindNeighbors(object = harmonized_obj, reduction = "harmony")
harmonized_obj <- FindClusters(harmonized_obj, resolution = c(0.2, 0.4, 0.6, 0.8, 1.0, 1.2))  # Modify resolution values if necessary
ADD COMMENT

Login before adding your answer.

Traffic: 2521 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6