Hi,
I have scRNA-Seq data that I am analyzing with Seurat v3.2.2. I have three samples: one SHAM and two from UUO-operated kidneys, and I am looking for an injury response. In my analysis I perform integration, using SCT normalization only for anchor detection (FindIntegrationAnchors). For the integration itself and the downstream steps (PCA, UMAP, clustering, etc.) I use log-normalized and scaled data.
This is the code I am using for the analysis:
library(Seurat)
# reference.list is assumed to be a list of Seurat objects, each already run through SCTransform()
integration.features <- SelectIntegrationFeatures(object.list = reference.list, nfeatures = 3000)
prep.sct.integration <- PrepSCTIntegration(object.list = reference.list, anchor.features = integration.features)
# Anchors are detected on the SCT-normalized data
integrate.anchors <- FindIntegrationAnchors(object.list = prep.sct.integration, anchor.features = integration.features, normalization.method = "SCT")
# Integration itself uses the default normalization.method = "LogNormalize"
integrated.data <- IntegrateData(anchorset = integrate.anchors)
# Downstream analysis on the integrated data
integrated.data <- ScaleData(object = integrated.data, verbose = TRUE)
integrated.data <- RunPCA(object = integrated.data, verbose = TRUE)
integrated.data <- RunUMAP(integrated.data, dims = 1:30, verbose = TRUE)
integrated.data <- FindNeighbors(object = integrated.data, dims = 1:30, verbose = TRUE)
integrated.data <- FindClusters(integrated.data, resolution = 0.5, verbose = TRUE)
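One way to put a number on how well the samples mix after clustering, rather than judging it from the UMAP alone, is to tabulate the fraction of each cluster contributed by each sample. The helper below is only a sketch: `cluster_sample_mix` is a hypothetical name, and it assumes the sample labels are stored in the `orig.ident` metadata column, which may not match your objects.

```r
# Hypothetical helper: row-wise proportions of samples within each cluster.
# `clusters` and `samples` are equal-length vectors (or factors) with one entry per cell.
cluster_sample_mix <- function(clusters, samples) {
  counts <- table(clusters, samples)   # cells per cluster x sample
  prop.table(counts, margin = 1)       # each row (cluster) sums to 1
}

# Usage with a Seurat object (assumes sample labels are in `orig.ident`):
# mix <- cluster_sample_mix(Idents(integrated.data), integrated.data$orig.ident)
# round(mix, 2)
```

Clusters made up almost entirely of one sample can then be inspected to decide whether they look like an injury-specific population or a residual batch effect; the table itself cannot tell the two apart.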
When I use log normalization for both anchor detection and integration, there is a clear batch effect: in the UMAP I see sample-specific populations located far from each other (my UUO samples are far from the SHAM). When I instead use SCT normalization in both steps (FindIntegrationAnchors and IntegrateData), I suspect I am losing the actual effect introduced by the injury, and the downstream analysis is unclear. Therefore, I use SCT normalization only for anchor detection and log normalization for the downstream analysis, which gives results that are neither too stringent nor too lenient in removing sample-specific effects.
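For comparison, the two non-hybrid variants described above can be sketched as follows. This is a minimal sketch, not the exact code that was run: the function names `integrate_full_sct` and `integrate_full_lognorm` and the argument `obj.list` are hypothetical, the first variant assumes SCTransform() has already been run on every object, and the second assumes the default assay of every object is the raw RNA assay.

```r
# Variant A: SCT for both anchor detection and integration.
# Assumes every object in obj.list has already been processed with SCTransform().
integrate_full_sct <- function(obj.list, nfeatures = 3000, dims = 1:30) {
  features <- Seurat::SelectIntegrationFeatures(object.list = obj.list, nfeatures = nfeatures)
  obj.list <- Seurat::PrepSCTIntegration(object.list = obj.list, anchor.features = features)
  anchors <- Seurat::FindIntegrationAnchors(object.list = obj.list,
                                            anchor.features = features,
                                            normalization.method = "SCT")
  integrated <- Seurat::IntegrateData(anchorset = anchors, normalization.method = "SCT")
  # ScaleData() is skipped here: the SCT-integrated assay goes straight into PCA.
  integrated <- Seurat::RunPCA(integrated, verbose = FALSE)
  Seurat::RunUMAP(integrated, dims = dims, verbose = FALSE)
}

# Variant B: log normalization for both anchor detection and integration.
# Assumes the default assay of every object is the raw RNA assay.
integrate_full_lognorm <- function(obj.list, nfeatures = 2000, dims = 1:30) {
  obj.list <- lapply(obj.list, function(x) {
    x <- Seurat::NormalizeData(x, verbose = FALSE)
    Seurat::FindVariableFeatures(x, selection.method = "vst",
                                 nfeatures = nfeatures, verbose = FALSE)
  })
  anchors <- Seurat::FindIntegrationAnchors(object.list = obj.list, dims = dims)
  integrated <- Seurat::IntegrateData(anchorset = anchors, dims = dims)
  integrated <- Seurat::ScaleData(integrated, verbose = FALSE)
  integrated <- Seurat::RunPCA(integrated, verbose = FALSE)
  Seurat::RunUMAP(integrated, dims = dims, verbose = FALSE)
}
```

Running both on the same input list and plotting each result grouped by sample gives a side-by-side reference against which the hybrid result can be compared.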
So I would like your advice: is such a hybrid approach acceptable in this case?
Any suggestions, discussion, or help would be greatly appreciated.
Thanks in advance :)
If you need more clarification on this please feel free to ask.
Regards,
Nitin N.
Cross-posted: https://github.com/satijalab/seurat/discussions/5795
Yes, I first posted this question on the Seurat discussion page on GitHub, but I did not get any reply there, so I posted it here hoping to get some relevant answers or suggestions.
Thanks,
Nitin N.