3.3 years ago
mk
How can I run umap on a seurat object, and specify the features (genes) to use for the initial PCA reduction?
I'm looking for something like what the following [hypothetical] syntax would achieve:
data("pbmc_small")
pbmc_small
# Run UMAP on the first 5 PCs
pbmc_small <- RunUMAP(object = pbmc_small, dims = 1:5, reduction = "pca", reduction features = c("CD79A", "MS4A1", "TCL1A", "HLA-DQA1", "HLA-DQB1", ...))
Where the [non-existent] option "reduction features" specifies the features to use for RunUMAP(...,reduction="pca")
RunUMAP has an argument, features, where you can specify the features to run PCA on.

I think this actually bypasses dimred and uses the features for embedding, which is what I'm trying to avoid:
If you are planning to select a relatively small number of features - say, fewer than 50 - you could skip the PCA and let UMAP work with them directly.

Separately, PCA already takes care of uninformative features by selecting first the eigenvectors that maximize the variance. It would be less biased if you go with PCA rather than hand-picking the features.

Yeah, for these 'special' UMAP embeddings the feature subset is more like 1-2k genes, so I think it's still optimal to dimred prior to embedding.
Agreed, dimred usually obviates marker selection for embedding. The issue is that I know beforehand that some markers in the original object actually code for certain confounding factors, and those are only of biological interest for a subset of the visualizations I need to generate.
In the end I just created a whole new Seurat object with only the features I want for that particular plot. Not ideal imho, since I'm going to end up with a bunch of Seurat objects all containing the same cells, or nearly the same cells.
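
For what it's worth, one way to possibly avoid the extra objects: RunPCA takes a features argument and both RunPCA and RunUMAP take a reduction.name, so a feature-restricted PCA and its UMAP can be stored next to the default ones in the same object. A minimal sketch, assuming a reasonably recent Seurat; the names "pca_custom" and "umap_custom" and their keys are made up for illustration, and the gene vector is just the five genes named in the question (the full list there is truncated), which is why npcs is capped at 4 here:

```r
library(Seurat)

data("pbmc_small")

# Illustrative feature set: only the five genes named in the question.
feats <- c("CD79A", "MS4A1", "TCL1A", "HLA-DQA1", "HLA-DQB1")
feats <- intersect(feats, rownames(pbmc_small))
n_pcs <- length(feats) - 1  # PCA cannot return more PCs than features - 1

# RunPCA reads from scale.data, so the chosen features must be scaled first.
# Scaling all features here (an assumption) keeps the existing ones available.
pbmc_small <- ScaleData(pbmc_small, features = rownames(pbmc_small), verbose = FALSE)

# PCA on the chosen features only, stored under its own name so the
# default "pca" reduction is left untouched.
pbmc_small <- RunPCA(pbmc_small, features = feats, npcs = n_pcs,
                     reduction.name = "pca_custom", reduction.key = "PCcustom_",
                     verbose = FALSE)

# UMAP on the custom PCA, again stored under a separate name.
pbmc_small <- RunUMAP(pbmc_small, reduction = "pca_custom", dims = 1:n_pcs,
                      reduction.name = "umap_custom", reduction.key = "UMAPcustom_",
                      verbose = FALSE)
```

Each plot can then pick its embedding with DimPlot(pbmc_small, reduction = "umap_custom"). With a 1-2k gene subset, npcs and dims would go back to something more typical. Note that ScaleData called with a narrower features vector replaces the contents of scale.data, so it is worth checking that nothing downstream depends on the previous scaling.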