I've heard a lot of people discussing UMAP recently as though it has essentially superseded t-SNE for visualizing scRNA-seq data. UMAP is certainly impressive, but it seems to me that there are a lot of things one can do to improve the output of t-SNE pretty dramatically: for example, perplexity annealing, or PCA initialization followed by merging two perplexities (all of which are described at https://www.biorxiv.org/content/10.1101/453449v2). All of the comparisons that I have seen between UMAP and t-SNE (e.g. https://www.nature.com/articles/nbt.4314.pdf) compare UMAP to plain t-SNE, without these "tricks" that can improve the t-SNE plots. This feels a little like a strawman to me. Has anyone done any work, or seen any studies, comparing UMAP to t-SNE with these improvements for scRNA-seq data visualization?
Part of the issue with t-SNE is that you get different results each run, it doesn't scale well, and the "rigorous" improvements you mention require extra setup or aren't supported in most packages. If they're shown to be a real improvement, they will likely be adopted in time as people become more aware of them (as was, and still is, the case for UMAP). Convenience often reigns supreme.
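For what it's worth, the simplest of those tricks (PCA initialization) together with a fixed seed is available in common packages. Here is a minimal sketch using scikit-learn's `TSNE`, assuming a synthetic matrix as a stand-in for real expression data; the helper name `run_tsne` and all parameter values are illustrative choices, not taken from either paper. It does not attempt perplexity annealing or merging two perplexities.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Synthetic stand-in for a cells x genes matrix (assumption: NOT real scRNA-seq data).
# Three well-separated groups of 100 "cells" with 50 "genes" each.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 50))
X = np.vstack([c + rng.normal(size=(100, 50)) for c in centers])

# Reduce to a handful of principal components before running t-SNE.
X_pca = PCA(n_components=20, random_state=0).fit_transform(X)


def run_tsne(data, seed=0):
    """Hypothetical helper: t-SNE with PCA initialization and a fixed seed."""
    tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=seed)
    return tsne.fit_transform(data)


emb_a = run_tsne(X_pca)
emb_b = run_tsne(X_pca)

print("embedding shape:", emb_a.shape)
# How far apart two runs with the same seed and initialization end up.
print("max abs difference between runs:", np.abs(emb_a - emb_b).max())
```

Fixing `random_state` and using `init="pca"` removes the random starting layout, which is the main source of run-to-run variation; it says nothing about scaling or about how the result compares to UMAP.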
Hi rtrende, what package do you use to run UMAP? Thanks.
I've been running UMAP using Seurat, which uses the Python umap-learn package.
There is also the umap package in R (on CRAN).
The Bioconductor package scater offers convenience functions for both t-SNE and UMAP.