Question

Normalized data values very different from unnormalized counts in Seurat


13 months ago

AHerik ▴ 20

I used the sceasy R package to convert Burclaff et al.'s (2022) single-cell data (GSE185224) from a scanpy H5AD file to a Seurat object. My object's UMAP looks similar to the authors', and I subsetted out the "colon" samples. I then normalized and scaled the subsetted data with default parameters, but the normalized data looks very different from the counts. I would like advice on how to proceed. For context, the authors stated that

"After filtering, read counts were logtransformed and normalized to the median read depth of donor 2, which had the fewest read counts"
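One way to read the authors' sentence is: scale every cell to a common reference depth (the median total count of donor 2's cells), then log-transform. The base R sketch below shows that arithmetic on made-up numbers. The toy matrix, the `donor` vector, and the order of operations (scale first, then `log1p`) are assumptions for illustration, not the authors' confirmed pipeline.

```r
# Toy counts: 3 genes x 4 cells (made-up numbers, not from GSE185224)
counts <- matrix(
  c(4, 0, 6,
    8, 2, 10,
    1, 1, 2,
    3, 3, 0),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)
# Hypothetical donor label for each cell (column)
donor <- c("donor1", "donor1", "donor2", "donor2")

depth  <- colSums(counts)                     # total counts per cell
target <- median(depth[donor == "donor2"])    # reference depth from donor 2

# Scale each cell to the reference depth, then log-transform
norm <- log1p(sweep(counts, 2, depth, "/") * target)
```

If this reading is right, the rough Seurat equivalent would be `NormalizeData()` with `scale.factor` set to that median instead of the default 10,000, applied to raw counts only.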

I have attached FeaturePlot screenshots for one gene, comparing the normalized data with the counts.

Thank you, Aydin

[Screenshots: FeaturePlot of one gene, "Counts" panel and "Normalized" panel]

scRNA Single-cell Seurat • 680 views



I would suggest proceeding with GSE185224_Donor1_filtered_feature_bc_matrix.h5, GSE185224_Donor2_filtered_feature_bc_matrix.h5 and GSE185224_Donor3_filtered_feature_bc_matrix.h5 rather than GSE185224_clustered_annotated_adata_k10_lr0.92_v1.7.h5ad.gz. It is likely that you have double-normalized the data: the H5AD object may already hold log-normalized values, and normalizing them again would distort them. You are better off starting from the raw data.
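To illustrate what double normalization does, here is a small base R sketch on made-up numbers. `log_normalize()` reproduces the arithmetic of Seurat's default "LogNormalize" method (divide by the cell total, multiply by a scale factor, take `log1p`); `looks_like_raw_counts()` is a hypothetical helper, not a Seurat function, that checks whether a matrix holds non-negative whole numbers.

```r
# Toy counts: 3 genes x 4 cells (made-up numbers, not from GSE185224)
counts <- matrix(
  c(0, 10, 90,
    5, 0, 45,
    2, 8, 190,
    0, 1, 9),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Same arithmetic as Seurat's default "LogNormalize"
log_normalize <- function(m, scale_factor = 1e4) {
  log1p(sweep(m, 2, colSums(m), "/") * scale_factor)
}

# Heuristic: raw counts are non-negative whole numbers
looks_like_raw_counts <- function(m) {
  all(m >= 0) && all(m == round(m))
}

once  <- log_normalize(counts)  # normal workflow
twice <- log_normalize(once)    # what happens if the input was already normalized

looks_like_raw_counts(counts)   # raw input passes the check
looks_like_raw_counts(once)     # normalized input does not

# Spread of the non-zero values before and after the second pass
range(once[once > 0])
range(twice[twice > 0])
```

After the second pass the non-zero values are squeezed into a much narrower range, which is consistent with a FeaturePlot that looks nothing like the counts. On a real object, the same check could be run on the non-zero entries of the stored counts matrix, for example the `@x` slot of the sparse matrix returned by `GetAssayData()`.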

13 months ago by bk11 ★ 3.0k


Thanks for the suggestion! I actually did this, but another issue I faced is trying to demultiplex the samples. Does this look correct to you?

# Load the data for Donor1, Donor2, and Donor3
Donor1.data <- Read10X_h5(filename = "donor1/GSE185224_Donor1_filtered_feature_bc_matrix.h5", unique.features = TRUE)
Donor1 <- Seurat::CreateSeuratObject(counts = Donor1.data$`Gene Expression`, min.cells = 3, min.features = 200)

Donor2.data <- Read10X_h5(filename = "donor2/GSE185224_Donor2_filtered_feature_bc_matrix.h5", unique.features = TRUE)
Donor2 <- Seurat::CreateSeuratObject(counts = Donor2.data$`Gene Expression`, min.cells = 3, min.features = 200)

Donor3.data <- Read10X_h5(filename = "donor3/GSE185224_Donor3_filtered_feature_bc_matrix.h5", unique.features = TRUE)
Donor3 <- Seurat::CreateSeuratObject(counts = Donor3.data$`Gene Expression`, min.cells = 3, min.features = 200)

# Add HTO data as a new assay independent from RNA
Donor1[["HTO"]] <- CreateAssayObject(Donor1.data[["Antibody Capture"]][, colnames(x = Donor1)])
Donor2[["HTO"]] <- CreateAssayObject(Donor2.data[["Antibody Capture"]][, colnames(x = Donor2)])
Donor3[["HTO"]] <- CreateAssayObject(Donor3.data[["Antibody Capture"]][, colnames(x = Donor3)])

###########

Donor1[["percent.mt"]] <- PercentageFeatureSet(Donor1, pattern = "^MT-")
Donor1 <- subset(Donor1, subset = nFeature_RNA > 500 & nCount_RNA > 3000 & nCount_RNA < 50000 & percent.mt < 75)

Donor2[["percent.mt"]] <- PercentageFeatureSet(Donor2, pattern = "^MT-")
Donor2 <- subset(Donor2, subset = nFeature_RNA > 800 & nCount_RNA > 1000 & nCount_RNA < 30000 & percent.mt < 50)

Donor3[["percent.mt"]] <- PercentageFeatureSet(Donor3, pattern = "^MT-")
Donor3 <- subset(Donor3, subset = nFeature_RNA > 500 & nCount_RNA > 3000 & nCount_RNA < 50000 & percent.mt < 75)

# Perform demultiplexing

# If you have a very large dataset, we suggest using kfunc = "clara". This is a k-medoid
# clustering function for large applications. You can also play with additional parameters (see
# the documentation for HTODemux()) to adjust the threshold for classification. Here we are using
# the default settings.

Donor1 <- NormalizeData(Donor1, assay = "HTO", normalization.method = "CLR")
Donor1 <- HTODemux(Donor1, assay = "HTO", positive.quantile = 0.99)
Donor1 <- subset(Donor1, idents = "Negative", invert = TRUE)

Donor2 <- NormalizeData(Donor2, assay = "HTO", normalization.method = "CLR")
Donor2 <- HTODemux(Donor2, assay = "HTO", positive.quantile = 0.99)
Donor2 <- subset(Donor2, idents = "Negative", invert = TRUE)

Donor3 <- NormalizeData(Donor3, assay = "HTO", normalization.method = "CLR")
Donor3 <- HTODemux(Donor3, assay = "HTO", positive.quantile = 0.99)
Donor3 <- subset(Donor3, idents = "Negative", invert = TRUE)
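One thing that may be worth checking in the code above: after `HTODemux()`, removing only the "Negative" identity still leaves cells called as doublets in the object. A minimal sketch of how the calls could be inspected and restricted to singlets follows. `keep_singlets()` is a hypothetical helper, and it assumes `library(Seurat)` is loaded and that `HTODemux()` has already been run so the `HTO_classification.global` metadata column exists.

```r
# Hypothetical helper (not from the thread): summarise the HTODemux() calls
# for one donor and keep only the cells classified as singlets.
# Assumes library(Seurat) is already loaded, as in the code above.
keep_singlets <- function(obj) {
  # HTODemux() stores a per-cell call of "Singlet", "Doublet" or "Negative"
  print(table(obj$HTO_classification.global))

  # Switch the active identities to the global call, then subset on it
  Idents(obj) <- "HTO_classification.global"
  subset(obj, idents = "Singlet")
}

# Usage, assuming Donor1 is the object produced above:
# Donor1 <- keep_singlets(Donor1)
```

Whether doublets should be dropped at this point depends on the downstream analysis; the printed table at least shows how many cells fall into each class for each donor.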

13 months ago by AHerik ▴ 20