Hi everyone. I have a question. I have already read about the difference between the merge and integrate functions in Seurat, but I would like some advice on my specific case. I have samples from two different patients, and for each patient I have a control and a "treated" sample, like this:
            Control   Treatment
Patient A   1Ac       1At
Patient B   1Bc       1Bt
The "treated" sample is a subpopulation of the control: the only difference from the control is that the cells have been selected for a marker expressed by just a fraction of the total population.
1) I started by analyzing the two patients independently, looking only at how the treated sample clusters relative to its own control. To combine the two samples in each case (1Ac vs 1At; 1Bc vs 1Bt) I used the "merge" function. The clustering was pretty good for both patients.
2) I then compared the controls from the two patients to check whether the samples were similar (1Ac vs 1Bc). I used "merge", but the two patients clustered separately, apparently having nothing in common. At this point I found out about the "integrate" function, which seems more appropriate when the differences could be due to natural patient variability. I applied it and the clustering was much better, although I could still see differences between the two patients in how their cells are distributed among the clusters.
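For context, the two ways of combining samples described above could be sketched roughly as follows. This is only a minimal sketch, assuming the anchor-based workflow from Seurat v3/v4 (`FindIntegrationAnchors` followed by `IntegrateData`); the helper names `combine_by_merge` and `combine_by_integration` and the input objects `ctrl_a` and `ctrl_b` are hypothetical and not taken from the original analysis.

```r
library(Seurat)

# Option 1: plain merge. The count matrices are concatenated and processed
# as a single dataset, with no correction between samples.
# obj1, obj2: Seurat objects with raw counts (hypothetical inputs).
# ids: character vector of prefixes added to the cell names, e.g. c("1Ac", "1Bc").
combine_by_merge <- function(obj1, obj2, ids) {
  merged <- merge(x = obj1, y = obj2, add.cell.ids = ids)
  merged <- NormalizeData(merged)
  merged <- FindVariableFeatures(merged)
  merged <- ScaleData(merged)
  RunPCA(merged)
}

# Option 2: anchor-based integration. Each sample is normalized separately,
# anchors are found between samples, and a corrected "integrated" assay is
# built for downstream clustering.
# obj_list: list of Seurat objects, one per sample (hypothetical input).
combine_by_integration <- function(obj_list) {
  obj_list <- lapply(obj_list, function(obj) {
    obj <- NormalizeData(obj)
    FindVariableFeatures(obj)
  })
  features <- SelectIntegrationFeatures(object.list = obj_list)
  anchors <- FindIntegrationAnchors(object.list = obj_list,
                                    anchor.features = features)
  integrated <- IntegrateData(anchorset = anchors)
  DefaultAssay(integrated) <- "integrated"
  integrated <- ScaleData(integrated)
  RunPCA(integrated)
}
```

With two control objects this would be called as `combine_by_merge(ctrl_a, ctrl_b, ids = c("1Ac", "1Bc"))` or `combine_by_integration(list(ctrl_a, ctrl_b))`; clustering and UMAP would then be run on the returned object in either case.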
My question is: if I decide to present all these data together, do I also have to use "integrate" instead of "merge" when I study each patient independently (1Ac vs 1At; 1Bc vs 1Bt)? Is it acceptable to combine the datasets in a different way depending on the level of analysis? Of course the clustering changes a bit, but I don't think an intra-patient batch correction is necessary (at least judging from the UMAP).
Sorry for the long post, but I'm just getting started with single-cell analysis and I have a lot to learn.
Thanks
Francesca
Thanks. I started with merge because I was following the pipeline I saw in a paper, but what you are saying makes sense. I absolutely agree with you about integrating different donors, but for cells that come from the same patient, where the only difference is a selection using a marker expressed by just part of the original population, I thought the "integrate" function could over-correct. But I will try it and check how much the results differ. Thanks.
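One simple way to check how much the results differ might be to tabulate, for each cluster, the fraction of cells coming from each sample, once for the merged object and once for the integrated one. The sketch below assumes that clustering has already been run, so that `Idents()` holds the cluster labels, and that the sample label is stored in the `orig.ident` metadata column; the helper name `cluster_composition` is hypothetical.

```r
library(Seurat)

# Per-cluster sample composition: rows are clusters, columns are samples,
# and each row sums to 1.
# obj: a clustered Seurat object (hypothetical input).
# sample_col: metadata column holding the sample label (assumed "orig.ident").
cluster_composition <- function(obj, sample_col = "orig.ident") {
  clusters <- Idents(obj)
  samples <- obj[[sample_col]][, 1]
  counts <- table(cluster = clusters, sample = samples)
  prop.table(counts, margin = 1)
}
```

Comparing the two tables side by side would show whether the selected (treated) cells still concentrate in a subset of clusters after integration, or whether they end up spread evenly across all of them.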