4.2 years ago
annadv
Hello All,
I have 3 samples sequenced with scRNA-seq technology (10x Genomics). Each sample represents a different experimental group. I am using the Seurat package for the analysis, and I am wondering whether I should first run SCTransform on each sample separately and then merge them, or merge them first and then run SCTransform on the merged object.
Thank you very much for your help.
Regards, Anna
I would say that depends on the analysis goal. Can you explain what you plan to do with the data? Are these samples biological replicates? Are they similar or very different from each other?
Thank you for your reply!
These samples are not biological replicates - each sample is RNA collected (pooled) from a different treatment group from the same experiment, and we are planning to define various cell populations and compare gene expression across these groups within different cell populations.
Sorry I do not understand. What do you mean by pooled?
From what I know about these samples - each sample contains cells from several animals.
Can you list what Seurat objects you have now, and what is in those objects? That might make it easier to understand the experimental setup.
I have 3 h5 files:
Originally I was planning the following steps:
My question was whether I should perform the following steps instead:
However, I found that, based on the study design, I should do integration instead of merging. Can you please let me know if this is the right way to work with my data?
The steps I have found are as follows:
Regards, Anna
Alright, I understand now. You should do SCTransform with each one separately, and then use integration to combine the separate objects. You can find more information in the 'SCTransform' tab of their integration vignettes.
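A minimal sketch of that workflow, assuming the Seurat v3/v4-style anchor-based integration functions (newer Seurat releases organise integration differently, so check the vignette for your installed version). The function name `integrate_sct_samples`, the `group` metadata column, and the example file names are hypothetical and only for illustration; `nfeatures = 3000` is an assumed setting, not a recommendation from this thread.

```r
# Sketch only: per-sample SCTransform followed by anchor-based integration.
# Assumes Seurat v3/v4-style functions; file paths and names are hypothetical.
integrate_sct_samples <- function(h5_paths) {
  # h5_paths: named character vector, one 10x h5 file per treatment group
  objects <- lapply(names(h5_paths), function(sample_name) {
    counts <- Seurat::Read10X_h5(h5_paths[[sample_name]])
    obj <- Seurat::CreateSeuratObject(counts = counts, project = sample_name)
    obj$group <- sample_name  # keep the treatment group as cell metadata
    # Normalize each sample on its own, before combining
    Seurat::SCTransform(obj, verbose = FALSE)
  })

  # Choose features shared across samples and prepare the SCT assays
  features <- Seurat::SelectIntegrationFeatures(object.list = objects,
                                                nfeatures = 3000)
  objects <- Seurat::PrepSCTIntegration(object.list = objects,
                                        anchor.features = features)

  # Find anchors between samples and integrate them into one object
  anchors <- Seurat::FindIntegrationAnchors(object.list = objects,
                                            normalization.method = "SCT",
                                            anchor.features = features)
  Seurat::IntegrateData(anchorset = anchors, normalization.method = "SCT")
}
```

It would be called with something like `integrate_sct_samples(c(groupA = "groupA.h5", groupB = "groupB.h5", groupC = "groupC.h5"))`, where the file names stand in for your three h5 files.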
Thank you very much!
I have a question regarding the integration. According to the description on the Satija Lab website, integration is based on "anchors" found between pairs of the datasets being integrated. The authors state that "these represent pairwise correspondences between individual cells (one in each dataset), that we hypothesize originate from the same biological state." They then show an example that integrates several datasets containing the same cell types but obtained with different sequencing technologies. In my case, however, each object represents a different treatment group, so, as far as I understand, the assumption of a shared biological state does not hold. Should I still integrate the objects using the anchor-based integration workflow?
Thank you very much!
Hi @annadv. Were you able to find an answer to this? People have been asking this question for a while now, with no clear-cut recommendations. I also raised a similar query here, but I am still waiting for an answer. What did you end up doing?