Is it Necessary to Use Raw Data for SCTransform When Regressing Out Cell Cycle Genes?
4 months ago
Kimaya • 0

I want to regress out cell cycle effects. To do that I first need normalized data to compute the cell cycle scores, and then I would use SCTransform to regress those scores out.

From what I understand, SCTransform performs normalization, scaling, and variable feature selection in a single command. So shouldn't we use the raw data as input for SCTransform when regressing out cell cycle scores, instead of normalized data? Otherwise, won't SCTransform normalize already normalized data, i.e. normalize it twice?

In short, my question is exactly like this one: "Seurat CellCycleScoring – confused about the proper order of operations when using SCTransform". I think its author has put it in better words than I have.

Do you have any insights on this?

PS: Normalizing the raw data to get cell cycle scores and then using the raw data object for SCTransform doesn't seem to work, because the raw (non-normalized) object will not have the metadata columns needed for vars.to.regress in SCTransform.

SCTransform • 747 views
4 months ago
LChart 4.7k

SCTransform's variance stabilization and normalization are based on a (regularized) negative binomial likelihood, so it really does expect counts. For that analysis you should go back to the raw data.

Under "old" Seurat this meant running ScaleData twice (once before scoring, once after scoring), and the same goes for SCTransform-based regression.
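The order of operations described above can be sketched as follows. This is only a minimal illustration, not an official recipe: `regress_cell_cycle_sct` and `seu` are hypothetical names, the object is assumed to keep its raw counts in an assay called "RNA" that is also the default assay (CellCycleScoring scores on the default assay), and the gene lists in `cc.genes.updated.2019` are human gene symbols, so they would need replacing for other species.

```r
# Sketch only: `seu` is a hypothetical Seurat object whose "RNA" assay
# holds raw counts and is the default assay.
regress_cell_cycle_sct <- function(seu, assay = "RNA") {
  # Log-normalize purely to obtain cell cycle scores. This fills the
  # normalized data of the assay and leaves the raw counts untouched.
  seu <- Seurat::NormalizeData(seu, assay = assay, verbose = FALSE)

  # Score each cell; this adds S.Score, G2M.Score and Phase to the metadata.
  cc <- Seurat::cc.genes.updated.2019  # human gene symbols (assumption)
  seu <- Seurat::CellCycleScoring(
    seu,
    s.features = cc$s.genes,
    g2m.features = cc$g2m.genes,
    set.ident = FALSE
  )

  # SCTransform is fitted on the raw counts of `assay`, so the data are not
  # normalized twice; the scores are only used as regression covariates.
  Seurat::SCTransform(
    seu,
    assay = assay,
    vars.to.regress = c("S.Score", "G2M.Score"),
    verbose = FALSE
  )
}
```

Because everything happens on one object, the score columns are already in the metadata by the time SCTransform is called.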


Thanks for your prompt reply. In theory, using raw counts seemed like the right thing to do to me as well. But the metadata of the raw object doesn't have the S/G2M score columns that are used as vars.to.regress. How can I tackle that?


You simply need to add the new columns to the metadata and re-call SCTransform. Note that SCTransform.Seurat calls into SCTransform.Assay (https://github.com/satijalab/seurat/blob/1549dcb3075eaeac01c925c4b4bb73c73450fc50/R/preprocessing.R#L3761), which in turn uses the @counts slot of the appropriate assay (https://github.com/satijalab/seurat/blob/1549dcb3075eaeac01c925c4b4bb73c73450fc50/R/preprocessing.R#L3655), so SCTransform will always attempt to use the raw count data if it is present in the object.
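If the scores were computed in a separate, normalized copy of the object, adding the columns could look roughly like the sketch below. The names `align_cc_scores`, `add_cc_scores`, `raw_obj` and `scored_obj` are hypothetical, and the code assumes both objects use the same cell barcodes and that the score columns carry the default names S.Score and G2M.Score.

```r
# Sketch only: pick the score columns for the cells of the raw object,
# in the raw object's cell order, and fail loudly on missing cells.
align_cc_scores <- function(raw_cells, scored_meta,
                            cols = c("S.Score", "G2M.Score")) {
  missing_cells <- setdiff(raw_cells, rownames(scored_meta))
  if (length(missing_cells) > 0) {
    stop(length(missing_cells), " cell(s) have no cell cycle scores")
  }
  scored_meta[raw_cells, cols, drop = FALSE]
}

# Hypothetical wrapper: copy the scores from `scored_obj` into the metadata
# of `raw_obj`, then run SCTransform on the raw counts.
add_cc_scores <- function(raw_obj, scored_obj) {
  scores <- align_cc_scores(Seurat::Cells(raw_obj), scored_obj@meta.data)
  raw_obj <- Seurat::AddMetaData(raw_obj, metadata = scores)
  Seurat::SCTransform(
    raw_obj,
    vars.to.regress = c("S.Score", "G2M.Score"),
    verbose = FALSE
  )
}
```

Matching by cell name rather than by position guards against the two objects having been filtered or reordered differently.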


Thanks a lot for the clarification :)
