I have 10x Visium spatial transcriptomics data. I am working with different conditions, each with two replicates (A1, A2, B1, B2). All samples were run through the spaceranger pipeline together with the image data, and I now have the spaceranger output (feature matrix and so on). Next, I followed these steps:
merge_A1_A2 datasets in Seurat -> SCTransform -> RunPCA -> FindClusters -> RunUMAP. Identified clusters.
merge_B1_B2 datasets in Seurat -> SCTransform -> RunPCA -> FindClusters -> RunUMAP. Identified clusters.
However, I have read some posts where researchers merged data from multiple conditions: merge_A1_B1_C1_D1 -> SCTransform -> RunPCA -> FindClusters -> RunUMAP.
Which approach is more suitable for highlighting the differences between conditions A and B?
To study the batch effect, I ran two separate analyses:
x) A1 -> SCTransform -> RunPCA -> FindClusters -> RunUMAP. Identified clusters.
A2 -> SCTransform -> RunPCA -> FindClusters -> RunUMAP. Identified clusters.
y) merge_A1_A2 -> SCTransform -> RunPCA -> FindClusters -> RunUMAP. Identified clusters.
The clusters obtained from approaches x and y differ. I am planning to continue with y (the merge_A1_A2 clusters). Would that be right?
I would appreciate any suggestions.
Are the replicates biological or technical?
technical replicate
With multiple samples you generally want to follow an integration workflow, such as those provided by Seurat, batchelor, or scVI. Integration helps to avoid cells separating by batch, rather than by biology, in the UMAP embedding and in the clustering.
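One quick way to see whether clustering is being driven by batch is to tabulate, for each cluster, what fraction of its spots comes from each sample. The sketch below is a minimal, library-free illustration in Python. The function names, the 0.9 threshold, and the toy labels are all assumptions made for the example, not part of any of the packages named above. In practice the two label vectors would be exported from the merged object's metadata.

```python
from collections import Counter, defaultdict


def cluster_composition(cluster_labels, sample_labels):
    """Return {cluster: {sample: fraction of that cluster's spots}}."""
    if len(cluster_labels) != len(sample_labels):
        raise ValueError("need one cluster label and one sample label per spot")
    counts = defaultdict(Counter)
    for cluster, sample in zip(cluster_labels, sample_labels):
        counts[cluster][sample] += 1
    return {
        cluster: {s: n / sum(per_sample.values()) for s, n in per_sample.items()}
        for cluster, per_sample in counts.items()
    }


def batch_dominated(composition, threshold=0.9):
    """List clusters in which one sample contributes at least `threshold`.

    The threshold is an arbitrary illustrative cutoff, not a standard value.
    """
    return sorted(
        cluster
        for cluster, fractions in composition.items()
        if max(fractions.values()) >= threshold
    )


# Toy labels (hypothetical): ten spots, three clusters, two replicates.
clusters = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
samples = ["A1", "A2", "A1", "A2", "A1", "A1", "A1", "A1", "A1", "A2"]

comp = cluster_composition(clusters, samples)
flagged = batch_dominated(comp)
for cluster in sorted(comp):
    print(cluster, comp[cluster])
print("clusters dominated by one sample:", flagged)
```

A cluster made up almost entirely of one technical replicate is a hint that integration may be needed. It is not proof, though, since tissue sections can also differ in which regions they capture.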
I wouldn't necessarily base your comparison of different samples on differences in UMAP composition, since UMAPs are not guaranteed to preserve local or global structure. Furthermore, you have no biological replicates, so accurate quantification of differentially expressed genes and/or composition is going to be somewhat dodgy for the time being.
For now I think your best bet is to define broad cell types based on clustering in PCA space, followed by non-pseudobulk differential expression (such as a Wilcoxon rank-sum test) between the two conditions. If the results look promising, consider collecting biological replicates for a more rigorous and advanced analysis.
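To make the rank-sum comparison concrete, here is a minimal Python sketch of a two-sided Wilcoxon rank-sum test for a single gene within one cluster, using the normal approximation with a tie correction and no continuity correction. It is only an illustration of the statistic: the function names and the toy expression values are assumptions, and real analyses would normally use the implementation in the analysis toolkit (Seurat's FindMarkers, for example, uses a Wilcoxon test by default).

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks (ties get their average rank) and tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def wilcoxon_rank_sum(x, y):
    """Two-sided rank-sum test via the normal approximation.

    Returns (U statistic for x, z score, p value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one observation")
    ranks, tie_sizes = rank_with_ties(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    tie_term = sum(t**3 - t for t in tie_sizes) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u == 0:  # every value identical: no evidence of a shift
        return u1, 0.0, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, z, p


# Toy normalized expression of one gene in one cluster (hypothetical values).
condition_a = [0.0, 0.0, 1.2, 0.8, 0.0, 1.5]
condition_b = [2.1, 1.9, 0.0, 2.4, 3.0, 1.7]
u, z, p = wilcoxon_rank_sum(condition_a, condition_b)
print(f"U={u:.1f} z={z:.2f} p={p:.3f}")
```

Because spots from the same section are not independent and there are no biological replicates, p-values from a test like this will tend to be optimistic. Treat them as a ranking of candidate genes rather than as firm evidence.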