4 months ago
sp
Hello,
I am new to scRNA-seq analysis. All of this is in R, and all functions were run with their default settings.
I am trying to use fastMNN only up to the data integration step, for a comparative study with other tools. I am following a very standard workflow that I found in the example code in the documentation:
- Read in the matrix as a sce.
- Normalised it with logNormCounts().
- Performed feature selection using modelGeneVar().
- Selected top n hvgs with getTopHVGs().
- Performed PCA and UMAP on the sce, used runPCA() and runUMAP(). This info I believe is stored in "PCA" and "UMAP" of the sce.
- Visualised the UMAP using plotReducedDim(), which I believe to be the same as the likes of plotUMAP(), except dimred is a requirement (which I set to "UMAP").
- Performed data integration using fastMNN() after subsetting using the chosen hvgs.
- Again ran PCA and UMAP on sce_integrated, except now dimred HAS to equal "corrected". I don't understand this.
- Plotted the UMAP for sce_integrated for comparison with before integration, and again used plotReducedDim(). I was not sure if dimred should equal "UMAP" or "corrected", since I believe the embeddings are stored in "corrected", so shouldn't "corrected" be used for visualisation as well? However when I plot dimred="UMAP", the UMAP is different from the UMAP earlier, which means the embeddings get overwritten?
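For reference, the steps above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the counts come from scuttle's mockSCE() rather than my real matrix, the two-level batch label is made up, and n = 500 is an arbitrary choice for the number of hvgs. Everything else is left at defaults.

```r
library(scuttle)    # mockSCE(), logNormCounts()
library(scran)      # modelGeneVar(), getTopHVGs()
library(scater)     # runPCA(), runUMAP(), plotReducedDim()
library(batchelor)  # fastMNN()

set.seed(1)
sce <- mockSCE(ncells = 200, ngenes = 2000)  # mock counts, stand-in for the real matrix
sce$batch <- rep(c("A", "B"), each = 100)    # hypothetical batch label

# Normalisation and feature selection
sce  <- logNormCounts(sce)
dec  <- modelGeneVar(sce)
hvgs <- getTopHVGs(dec, n = 500)             # n = 500 is an arbitrary choice here

# Before integration: results are stored as "PCA" and "UMAP" in sce
sce <- runPCA(sce, ncomponents = 50, subset_row = hvgs)
sce <- runUMAP(sce, dimred = "PCA")
plotReducedDim(sce, dimred = "UMAP", colour_by = "batch")

# Integration on the hvg subset: returns a new object holding "corrected"
sce_integrated <- fastMNN(sce[hvgs, ], batch = sce$batch)
reducedDimNames(sce_integrated)

# After integration: UMAP computed from the corrected coordinates
sce_integrated <- runUMAP(sce_integrated, dimred = "corrected")
plotReducedDim(sce_integrated, dimred = "UMAP", colour_by = "batch")
```

In this sketch sce and sce_integrated are separate objects, so each carries its own set of reduced dimensions.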
Summary of doubts:
- I don't understand why PCA and UMAP need to be run twice, before and after integration.
- Why is dimred="corrected" needed for runPCA after data integration? (earlier sce <- runPCA(sce, ncomponents = 50) worked).
- For plotting UMAPs should dimred="corrected" be used after data integration?
- Do UMAP embeddings get overwritten in sce if UMAP is run again after data integration?
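One way to check the last point directly is to look at which reduced dimensions an object holds before and after a second runUMAP() call. The sketch below is illustrative only: it uses mock data from scuttle's mockSCE(), and "UMAP_alt" is a hypothetical slot name chosen for this example. It relies on the name argument of runUMAP(), which defaults to "UMAP".

```r
library(scuttle)  # mockSCE(), logNormCounts()
library(scater)   # runPCA(), runUMAP()

set.seed(1)
sce_demo <- mockSCE(ncells = 100, ngenes = 500)  # mock data, not the real matrix
sce_demo <- logNormCounts(sce_demo)
sce_demo <- runPCA(sce_demo, ncomponents = 10)

# First UMAP goes into the default slot, "UMAP"
sce_demo <- runUMAP(sce_demo, dimred = "PCA")
umap_first <- reducedDim(sce_demo, "UMAP")

# Second UMAP is stored under a different (hypothetical) name
sce_demo <- runUMAP(sce_demo, dimred = "PCA", n_neighbors = 30, name = "UMAP_alt")

reducedDimNames(sce_demo)                             # lists every stored embedding
identical(umap_first, reducedDim(sce_demo, "UMAP"))   # is the first UMAP untouched?
```

Calling runUMAP() a second time without name would write to the "UMAP" slot of the same object again, so giving each run its own name is one way to keep both embeddings side by side.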
Thanks for all your help~ Sorry about the long post, I wanted to provide as much context as possible.
Thank you so much! This clarified everything for me and I was wondering if there was a way to not overwrite the embeddings so your answer was super helpful!!