4.8 years ago
623202215
Hi,
I have three patients, each with a normal and a tumor tissue sample (10× technology), so there are six samples in total. I want to combine them and find out whether there are differences between the normal and tumor samples in specific cell types (such as immune cells). Because the samples come from different patients, there may be batch effects. I am struggling to choose an integration method.
My questions are:
- Of the Seurat integration method, the SCTransform method, and the Harmony method, which is most suitable for my case? Or should I just merge the six samples directly?
- For the Seurat integration pipeline and the SCTransform pipeline, the recommendation is to use the "RNA" assay instead of the "integrated" assay for FindMarkers. As I understand it, the "RNA" assay is the batch-uncorrected matrix. Will that cause problems? Or are there better options?
Thanks a lot, I hope to get your suggestions!
Best,
Wei
Hi jared,
Thanks very much, that's very helpful. I will try it and see what the "best" solution is. In addition, I ran into several technical problems with the Seurat integration pipeline, and I would appreciate any comments.
In Seurat's integration pipeline, each sample is normalized individually, then anchors are found, the data are integrated, and ScaleData is run. The scaled data are saved in the integrated assay, and there is no scaled data in the RNA assay. If I switch to the RNA assay to run FindAllMarkers and want to plot the markers in a heatmap, how can I achieve that? Should I run ScaleData again on the RNA assay, use the integrated assay's scaled data, or scale before integration?
Another question is about regression. I notice that some people regress out unwanted signals such as percent mitochondrial reads or UMI counts during ScaleData. I would really like to know what Seurat does when it performs the regression. Could you describe it more specifically? I am not clear about what changes before and after regression. Sorry for the naive question, I am trying to understand more about Seurat.
Thanks again, I hope to get your suggestions!
Best, Wei
I have also run into this. I generally scale the data only for the RNA assay after integration so that the heatmap works properly. I've tried scaling before integration (but after merging all samples into a single Seurat object), but I think the scaled data gets removed for some reason during integration, if I remember correctly.
The regression essentially removes differences between cells that are due to differences in a given variable or variables. I recommend reading the Seurat papers or asking on their GitHub if you want a better explanation. The papers go into much more detail, and a few different questions have been asked on their GitHub issues page that you can find with a bit of searching.
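To make the idea a bit more concrete, here is a minimal sketch of regressing out a single variable for one gene. It is plain Python rather than R and is not Seurat's code. As far as I understand it, ScaleData with a regression variable fits a linear model per gene, keeps the residuals, and then centers and scales them, and that is all this toy example does. The function names (`regress_out`, `zscore`) and the numbers are made up for illustration.

```python
from statistics import mean, stdev

def regress_out(expression, covariate):
    """Residuals of a least-squares fit of expression on a single covariate."""
    x_bar = mean(covariate)
    y_bar = mean(expression)
    sxx = sum((x - x_bar) ** 2 for x in covariate)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(covariate, expression))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # What is left after removing the part of expression explained by the covariate
    return [y - (intercept + slope * x) for x, y in zip(covariate, expression)]

def zscore(values):
    """Center to mean 0 and scale to unit (sample) standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical toy data: one gene measured in six cells
percent_mito = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # unwanted variable
gene_expr = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]     # normalized expression

residuals = regress_out(gene_expr, percent_mito)
scaled = zscore(residuals)
print(scaled)
```

Before regression the gene tracks percent_mito almost perfectly. Afterwards the residuals have no linear relationship with it, so only the variation not explained by that variable is left for PCA, clustering, or a heatmap.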
Hi jared,
Thanks for your timely help. I will explore the related papers and GitHub. Thanks again!
Best, Wei
Hi jared,
Thanks for your reply. I have one more question, if you don't mind, and I hope you can give me some suggestions. Sorry to bother you again!
Best, Wei
Yeah, the way forward for doing that isn't clear. In my eyes, there are two options. The first is your current approach. The second is to do as fine-tuned a clustering as you think you'll need with all your samples (increasing the resolution to 1.5 or more), take your subset, and just roll with that. My guess is that your current approach will better serve your purpose.
Unfortunately, there's no clear answer, so you might have to experiment a bit to determine what yields the best results for your data.
Hi jared,
Thanks very much, I learned a lot from your answer. I agree with you, subsetting and running the integration pipeline again may be better.
I also tested the dimension settings of FindIntegrationAnchors() and IntegrateData(). In the first, the setting specifies which dimensions from the CCA to use for the neighbor search space. In the second, it is the number of PCs to use in the weighting procedure. When I set different numbers, the results can differ a lot. I know there is no clear answer and that it depends on the purpose. If you are familiar with these two parameters: from a computational standpoint, should the two be set to the same value, for example both 25, both 30, or both 35? My intuition is that the two parameters are not closely related. I would appreciate it if you could discuss this.
Thanks
Best, Wei
I've not used CCA (and I've seen the Seurat folks recommend the new integration method over it), so I'm not sure on that one.
I will explore it. Thanks a lot for your patient and timely help!