Question

Using SCTransform with Seurat for multi-sample RNA-seq data

2

6.1 years ago

steveh ▴ 70

Hi all,

I've been using Seurat for multi-sample RNA-Seq data as described in this tutorial:

https://satijalab.org/seurat/v3.0/immune_alignment.html

i.e. creating subsets for each sample then performing integration of the subsets.

Elsewhere in the Seurat docs, though, SCTransform is described and recommended as a replacement for the usual NormalizeData, ScaleData, and FindVariableFeatures functions.

In the subset-and-integrate workflow, though, NormalizeData and FindVariableFeatures are run only on the subsets, whilst ScaleData is applied to the integrated data. So I'm wondering whether SCTransform is compatible with multi-sample data, or whether I should just stick to using the three functions (NormalizeData, ScaleData, and FindVariableFeatures) individually.

Many thanks,

Steve

RNA-Seq R seurat SCTransform • 21k views

score 2 · Answer 1 · 2019-07-06

2

6.1 years ago

igor 13k

There is a thread about Seurat integration using SCTransform values here: https://github.com/ChristophH/sctransform/issues/4

Quoting from that thread: "the absolute best thing to do in our view would be to correct the Pearson residuals themselves (which are stored in the @scale.data slot of the SCT assay). In this case, you would not run ScaleData after integration, as the corrected residuals would already be placed in the @scale.data slot of the 'corrected' assay. This is not currently implemented in the public version of v3, but will be soon."

Update: there is now an SCTransform integration vignette available.
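
For readers who want the shape of that workflow, here is a minimal sketch. It assumes Seurat 3.1 or later (where PrepSCTIntegration() is available) and a hypothetical Seurat object `seu` with a metadata column named "sample"; the wrapper function name and its arguments are illustrative, not part of Seurat. Consistent with the quote above, ScaleData is not run after integration.

```r
library(Seurat)

# Sketch only: `seu`, `split_by` and `nfeatures` are hypothetical inputs.
integrate_with_sct <- function(seu, split_by = "sample", nfeatures = 3000) {
  # One object per sample, each normalised independently with SCTransform
  obj_list <- SplitObject(seu, split.by = split_by)
  obj_list <- lapply(obj_list, SCTransform, verbose = FALSE)

  # Choose shared features, then make sure their residuals exist in every object
  features <- SelectIntegrationFeatures(object.list = obj_list,
                                        nfeatures = nfeatures)
  obj_list <- PrepSCTIntegration(object.list = obj_list,
                                 anchor.features = features)

  # Anchor finding and integration both need to be told the data are SCT-normalised
  anchors <- FindIntegrationAnchors(object.list = obj_list,
                                    normalization.method = "SCT",
                                    anchor.features = features)
  integrated <- IntegrateData(anchorset = anchors,
                              normalization.method = "SCT")

  # No ScaleData here: the integrated assay already holds corrected residuals
  RunPCA(integrated, verbose = FALSE)
}
```

Usage would be along the lines of `integrated <- integrate_with_sct(seu)`, followed by the usual RunUMAP / FindNeighbors / FindClusters steps.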

1

I don't fully understand why one couldn't do the integration on the Pearson residuals; with the recent release they're being returned (as "corrected counts", I believe), so I'd assume one could use them in the recommended way?

ADD REPLY • link 6.1 years ago by Friederike 9.0k

2

Looks like there is a new PrepSCTIntegration() function: https://github.com/satijalab/seurat/blob/30f0df6b979cb61df0f093ce8eea06c1caebd024/R/integration.R#L1139-L1257

ADD REPLY • link 6.1 years ago by igor 13k

1

I had the same concern, but haven't had time to look into it in more depth to see what I am missing. At this point, I am assuming the "official" protocol will be posted at any moment.

ADD REPLY • link 6.1 years ago by igor 13k

0

Thanks Igor and Friederike, your replies are very helpful.

Just to clarify, my samples are all from the same batch, but the vignettes you point to are still applicable, particularly the one you provided here, Igor. It seems to me that whilst it's possible to use SCTransform in this context, it's not currently obvious or intuitive how to do it; for example, it's unclear whether or not to run ScaleData following integration. I think I'll just use the three functions individually for now, until the developers have completed their vignette on combining sctransform with Seurat v3 integration.

ADD REPLY • link 6.1 years ago by steveh ▴ 70

1

Why do you think you need the integration step? If there's no obvious batch effect, I would just run SCTransform and call it a day.

ADD REPLY • link 6.1 years ago by Friederike 9.0k

0

Yes, I think you're right, I don't need to do that. What I'm really interested in is being able to produce various plots which are either grouped or split by sample, and I was following the steps in that tutorial because it seemed to show how to do that. However, in their case the two samples are from different batches, hence the separation of the NormalizeData, FindVariableFeatures and ScaleData steps.

In my case, I think all I need to do is use the AddMetaData function to label cells differently on the whole dataset, then I can as you say just apply SCTransform to the whole thing.

Many thanks!

ADD REPLY • link 6.1 years ago by steveh ▴ 70

0

"In my case, I think all I need to do is use the AddMetaData function to label cells differently on the whole dataset, then I can as you say just apply SCTransform to the whole thing."

Yes, I'm fairly confident that that should work!
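
A minimal sketch of that approach, assuming a hypothetical merged Seurat object `seu` and a hypothetical character vector `sample_labels` with one entry per cell, in the same order as `colnames(seu)`. The wrapper function and the choice of 30 PCs are illustrative only.

```r
library(Seurat)

# Sketch only: `seu`, `sample_labels` and `dims` are hypothetical inputs.
sct_and_plot_by_sample <- function(seu, sample_labels, dims = 1:30) {
  # Name the labels by cell barcode so they are matched to the right cells
  names(sample_labels) <- colnames(seu)
  seu <- AddMetaData(seu, metadata = sample_labels, col.name = "sample")

  # Single SCTransform run on the whole dataset (no integration step)
  seu <- SCTransform(seu, verbose = FALSE)
  seu <- RunPCA(seu, verbose = FALSE)
  seu <- RunUMAP(seu, dims = dims, verbose = FALSE)

  list(
    object  = seu,
    grouped = DimPlot(seu, reduction = "umap", group.by = "sample"),  # one panel, coloured by sample
    split   = DimPlot(seu, reduction = "umap", split.by = "sample")   # one panel per sample
  )
}
```

The same `group.by` and `split.by` arguments are accepted by other Seurat plotting functions such as VlnPlot, so the "sample" column can be reused for those plots.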

ADD REPLY • link 6.1 years ago by Friederike 9.0k

score 1 · Answer 2 · 2019-07-05

1

6.1 years ago

Friederike 9.0k

You can add one or more batch variables to the SCTransform command, as shown here.

1

But just to clarify, according to the developer, "using the batch indicator variable in sctransform::vst does not replace an integration analysis as implemented in Seurat" (https://github.com/ChristophH/sctransform/issues/3).
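
For reference, a minimal sketch of passing a batch indicator directly to sctransform::vst through its `batch_var` argument. The inputs are assumptions: `umi` is a hypothetical genes-by-cells UMI count matrix with cell names as column names, and `batch` is a hypothetical vector of batch labels, one per cell. As the developer's comment above says, this is not a substitute for Seurat's integration.

```r
library(sctransform)

# Sketch only: `umi` and `batch` are hypothetical inputs.
vst_with_batch <- function(umi, batch) {
  # Per-cell attributes: sequencing depth plus the batch label
  cell_attr <- data.frame(
    log_umi   = log10(Matrix::colSums(umi)),
    batch     = factor(batch),
    row.names = colnames(umi)
  )

  # batch_var names the cell_attr column that identifies each cell's batch
  vst(umi,
      cell_attr  = cell_attr,
      latent_var = "log_umi",
      batch_var  = "batch")
}
```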