Question

Is it Necessary to Use Raw Data for SCTransform When Regressing Out Cell Cycle Genes?

11 months ago

Kimaya • 0

I want to regress out cell cycle effects. To do that, I first need normalized data to compute the cell cycle scores, and would then use SCTransform to regress them out.

From what I understand, SCTransform performs normalization, scaling, and variable feature selection in a single command. So shouldn't SCTransform be given the raw data as input when regressing out cell cycle effects, rather than normalized data? Otherwise, won't SCTransform normalize already-normalized data, i.e. normalize twice?

In short, my question is exactly like this one: Seurat CellCycleScoring – confused about the proper order of operations when using SCTransform. I think the author of that post put it in better words than I have.

Do you have any insights on this?

PS: Normalizing the raw data to get cell cycle scores and then using the raw data object for SCTransform doesn't seem to make sense, because the raw (non-normalized) object will not have the metadata columns needed for vars.to.regress in SCTransform.

SCTransform • 1.9k views

score 1 · Accepted Answer · 2024-08-28

11 months ago

LChart 5.0k

SCTransform's variance stabilization and normalization are based on a (regularized) negative binomial likelihood, so it really does expect counts. For that analysis you should go back to the raw data.

Under "old" Seurat this meant running ScaleData twice (once before scoring, once after scoring); the same goes for SCTransform-based regression.

Thanks for your prompt reply. Using raw counts seemed like the right thing to do in theory to me as well. But the metadata of the raw object doesn't have the S/G2M score columns that are used as vars.to.regress. How should I handle that?

ADD REPLY • link 11 months ago by Kimaya • 0

You simply need to add the new columns to the metadata and call SCTransform again. Note that SCTransform.Seurat calls into SCTransform.Assay (https://github.com/satijalab/seurat/blob/1549dcb3075eaeac01c925c4b4bb73c73450fc50/R/preprocessing.R#L3761), which in turn uses the @counts slot of the appropriate assay (https://github.com/satijalab/seurat/blob/1549dcb3075eaeac01c925c4b4bb73c73450fc50/R/preprocessing.R#L3655), so SCTransform will always attempt to use the raw count data if it is present in the object.
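
For anyone landing here later, the order of operations described above could look roughly like the sketch below. It is only an illustration, not a definitive pipeline: `counts`, `run_sct_with_cc_regression`, `copy_cc_scores`, `raw_obj` and `scored_obj` are hypothetical names, the assay is assumed to be called "RNA", and the gene lists are the human ones in `cc.genes.updated.2019` that ship with Seurat.

```r
library(Seurat)

# `counts` is assumed to be a genes x cells matrix of raw UMI counts
# (hypothetical input, not from the thread).
run_sct_with_cc_regression <- function(counts) {
  seu <- CreateSeuratObject(counts = counts)

  # 1. Log-normalize only so that cell cycle scores can be computed.
  #    The raw counts remain stored in the RNA assay.
  seu <- NormalizeData(seu)

  # 2. Score the cells; this adds S.Score, G2M.Score and Phase
  #    as columns in the object's metadata.
  seu <- CellCycleScoring(
    seu,
    s.features   = cc.genes.updated.2019$s.genes,
    g2m.features = cc.genes.updated.2019$g2m.genes
  )

  # 3. Run SCTransform on the RNA assay. It works from the raw counts
  #    and regresses out the two score columns taken from the metadata.
  seu <- SCTransform(
    seu,
    assay = "RNA",
    vars.to.regress = c("S.Score", "G2M.Score")
  )
  seu
}

# If the scores live in a separate object (`scored_obj`), they can be
# copied into the metadata of the raw object before calling SCTransform.
# Both objects are assumed to contain the same cell names.
copy_cc_scores <- function(raw_obj, scored_obj) {
  AddMetaData(
    raw_obj,
    metadata = scored_obj[[c("S.Score", "G2M.Score")]]
  )
}
```

The point of keeping everything in one object is that the normalization in step 1 never overwrites the counts, so nothing gets normalized twice; the helper at the end is only needed if the raw and the scored data were kept in two separate objects.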