Hi, I'm having a bit of trouble working out how best to integrate single-cell datasets. I have a time series of 3 developmental timepoints (d5, d10 and d15) with 2 replicates at each timepoint. I'm working my way through Seurat integration, Harmony and fastMNN. Should I merge the 2 replicates for each timepoint, then make another merged object of all timepoints and run the different integration methods on that, or should I make a list of the objects and integrate them?
With Seurat integration I was able to do it both ways, and I'm thinking that integrating as a list might be preferred. For fastMNN and Harmony, though, I think I have to merge the replicates, then merge all timepoints, then normalize, find variable features and integrate using the relevant packages.
If anyone can provide any insight into how replicates and multiple timepoints should be treated prior to integration, that would be much appreciated. Thank you!
Did you ever get any advice on this? I have a dataset organized in a similar way.
If you're using Seurat v5, the counts and data matrices for each sample are kept as separate layers within the assay by default, so you don't need to split the data into a list of objects. The layers stay separate unless you generate a single scale.data matrix with something like SCTransform. In v5, normalisation is run on each layer individually by default, and the integration step then produces the integrated reduction that downstream analysis uses.
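To make the layer-based workflow concrete, here is a rough sketch of how it could look for this design. It is not a definitive recipe: the object names (d5_rep1 ... d15_rep2) and the metadata columns sample and timepoint are hypothetical, it assumes all six objects were created under Seurat v5 (so the RNA assay is an Assay5 and merge() keeps one layer per input object), and it assumes each replicate is treated as its own batch, which this thread doesn't settle. The dims = 1:30 setting is just a placeholder.

```r
library(Seurat)
# library(SeuratWrappers)  # only needed for FastMNNIntegration

# Hypothetical input: six Seurat v5 objects, one per replicate.
objs <- list(
  d5_rep1  = d5_rep1,  d5_rep2  = d5_rep2,
  d10_rep1 = d10_rep1, d10_rep2 = d10_rep2,
  d15_rep1 = d15_rep1, d15_rep2 = d15_rep2
)

# Record sample and timepoint in the metadata before merging
# (column names are assumptions, pick whatever suits your data).
for (nm in names(objs)) {
  objs[[nm]]$sample    <- nm
  objs[[nm]]$timepoint <- sub("_.*$", "", nm)  # "d5_rep1" -> "d5"
}

# One merge of all six objects. With v5 assays the counts stay in
# separate layers, one per input object, so no list is needed afterwards.
merged <- merge(x = objs[[1]], y = objs[-1], add.cell.ids = names(objs))
print(Layers(merged[["RNA"]]))  # check that there is one counts layer per sample

# Standard preprocessing; in v5 normalisation runs layer by layer.
merged <- NormalizeData(merged)
merged <- FindVariableFeatures(merged)
merged <- ScaleData(merged)
merged <- RunPCA(merged)

# Integrate across layers. Swap the method to compare approaches:
# CCAIntegration, RPCAIntegration, HarmonyIntegration, or
# FastMNNIntegration (the last one comes from SeuratWrappers).
merged <- IntegrateLayers(
  object         = merged,
  method         = HarmonyIntegration,
  orig.reduction = "pca",
  new.reduction  = "harmony",
  verbose        = FALSE
)

# Rejoin the layers once integration is done, then cluster and embed
# on the integrated reduction.
merged <- JoinLayers(merged)
merged <- FindNeighbors(merged, reduction = "harmony", dims = 1:30)
merged <- FindClusters(merged)
merged <- RunUMAP(merged, reduction = "harmony", dims = 1:30)
```

Because every method runs on the same merged, layered object, you can call IntegrateLayers several times with different new.reduction names and compare the resulting embeddings side by side. If you would rather treat timepoint as the batch, one option is to call JoinLayers first and then re-split the assay by that column with split(merged[["RNA"]], f = merged$timepoint) before preprocessing.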