I just wanted to confirm: to identify differentially expressed genes across a (non-integrated) SCTransformed Seurat dataset (using FindAllMarkers), do we need to use the "RNA" assay after normalisation and scaling, and not the "SCT" assay? Whilst there are GitHub and Biostars posts (Github, Biostars_1, Biostars_3) suggesting this is the case, the SCTransform vignette suggests otherwise:
You can use the corrected log-normalized counts for differential expression and integration. However, in principle, it would be most optimal to perform these calculations directly on the residuals (stored in the scale.data slot) themselves. This is not currently supported in Seurat v3, but will be soon.
If you are using sctransform_v1, then you do not need to use the SCT assay for DE analysis. Recently, a new version, sctransform_v2, was released, which allows you to run DE on the SCT assay.
This update improves speed and memory consumption, the stability of parameter estimates, the identification of variable features, and the ability to perform downstream differential expression analyses.
To use the new feature you need to set the right "flavor" during transformation (SCTransform(object, vst.flavor = "v2")).
I don't know how the two methods compare, as I haven't tested the new flavor yet, but I am sure you will find additional info in the issues section of the GitHub repo.
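For reference, here is a minimal sketch of what that could look like on a single, non-integrated object. It assumes a Seurat version that supports `vst.flavor = "v2"` (4.1 or later, as far as I know) and uses the small `pbmc_small` example object bundled with SeuratObject as a stand-in for real data; the variable names `seu` and `markers_sct` are just placeholders.

```r
library(Seurat)

# Stand-in data: pbmc_small ships with SeuratObject and already has
# cluster identities set. Replace it with your own object.
data("pbmc_small")
seu <- pbmc_small

# Run SCTransform with the v2 flavor; this creates the "SCT" assay
# and makes it the default assay.
seu <- SCTransform(seu, vst.flavor = "v2", verbose = FALSE)

# Test each identity class against all other cells, using the SCT assay.
markers_sct <- FindAllMarkers(seu, assay = "SCT", only.pos = TRUE)
head(markers_sct)
```

If the object was built by merging or integrating several SCTransformed datasets, Seurat also provides `PrepSCTFindMarkers()` to run before marker detection; for a single SCT model, as in this sketch, that step should not be needed.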
Ok, that is super helpful. I'll most likely use this, since it allows for reference to one set of data and should obviate any issues or concerns with controlling for other covariates. Thanks!
As far as I know, this is still kind of true: running differential expression analysis on SCTransform residuals gives weird results. It's also hard to judge whether the results make sense, since the scale of the residuals is so different. For a typical RNA assay after regular normalization, you get log-scale values (natural log with Seurat's default LogNormalize) that are easy to interpret, e.g. an expression of 1 is low and an expression of 6 is very high in a typical 10x experiment. For SCTransform it's very hard to tell if your marker even makes sense.
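If you would rather stay on the log-normalized scale, a rough sketch of running the markers on the "RNA" assay of an SCTransformed object might look like this. It again uses `pbmc_small` as a stand-in, and `seu` and `markers_rna` are placeholder names. Note that `FindAllMarkers` reads the normalized data by default rather than scale.data, so `ScaleData()` should not be required for this step.

```r
library(Seurat)

# Stand-in data; replace with your own SCTransformed object.
data("pbmc_small")
seu <- pbmc_small
seu <- SCTransform(seu, verbose = FALSE)  # default assay is now "SCT"

# Switch back to the raw-count assay and log-normalize it.
DefaultAssay(seu) <- "RNA"
seu <- NormalizeData(seu, verbose = FALSE)

# Marker detection on the log-normalized RNA values.
markers_rna <- FindAllMarkers(seu, assay = "RNA", only.pos = TRUE)
head(markers_rna)
```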