I have 6 scRNA-seq samples and made a UMAP using all cells from all 6 samples. In the UMAP, every sample has a different color (6 colors in total). The goal was to see how much the samples overlap in the UMAP, and then to do the follow-up analysis and compare the samples. Samples 1, 2 and 3 are from condition A, and samples 4, 5 and 6 are from condition B. In the end, I would like to compare condition A vs. condition B.
Here is the UMAP I mentioned above:
![enter image description here][1]
This UMAP shows that 5 samples (from both conditions) overlap a lot, but sample 2 has a different pattern.
To compare the 2 conditions, I would like to pool all cells from all 6 samples, cluster them together, see how the cells cluster, and detect the markers of each cluster, as in the following UMAP:
In this UMAP, there are 3 clusters. The 5 overlapping samples (from both conditions) fall in one cluster, while sample 2 is spread across all 3 clusters. If I detect markers for these clusters, they will not be specific to either condition, because the clusters are driven by the pattern of sample 2.
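To make the pattern concrete, the per-sample cluster composition could be tabulated roughly as follows. This is only a minimal sketch: the `cells` list, the sample names, and the cell counts are made-up placeholders, not my real data, and `cluster_fractions` is a hypothetical helper.

```python
from collections import Counter, defaultdict

# Hypothetical (sample, cluster) labels, one pair per cell; not real data.
cells = (
    [("sample1", 0)] * 5
    + [("sample3", 0)] * 5
    + [("sample4", 0)] * 5
    + [("sample2", 0)] * 2
    + [("sample2", 1)] * 2
    + [("sample2", 2)] * 1
)


def cluster_fractions(cells):
    """Return {sample: {cluster: fraction of that sample's cells}}."""
    counts = defaultdict(Counter)
    for sample, cluster in cells:
        counts[sample][cluster] += 1
    return {
        sample: {c: n / sum(ctr.values()) for c, n in sorted(ctr.items())}
        for sample, ctr in counts.items()
    }


fractions = cluster_fractions(cells)
for sample in sorted(fractions):
    print(sample, fractions[sample])
```

A sample whose cells sit in a single cluster gets one entry with fraction 1.0, while a sample like my sample 2 shows up with entries in several clusters.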
If I remove sample 2 and make the UMAP using the other 5 samples, 4 samples overlap a lot and sample 5 has a quite different pattern (which suggests that the differing patterns are not due to condition). If I then remove sample 5, I encounter the same issue again. If I kept following this procedure, I would have to remove all samples. So, to be able to use all 6 samples and still detect condition-specific markers, what would be the solution?