Hello,
I recently adopted the Harvard Chan Bioinformatics Core guidelines for single-cell QC/normalization/clustering (https://hbctraining.github.io/scRNA-seq_online/schedule/links-to-lessons.html). I have integrated CD4+/CD8+ T cells from two time points.
I recently received feedback that the clustering in my integrated dimension reduction plot looked problematic. Specifically, the concerns were the small peripheral clusters (a splash/star pattern?) and the number of distinct clusters.
Data were normalized using SCTransform. The variables regressed out were the mitochondrial ratio and the G2M-S phase score difference, as suggested for differentiating cell types (see "Alternative Workflow": https://satijalab.org/seurat/articles/cell_cycle_vignette.html).
Clusters were called using 40 PCs at a resolution of 0.6.
As for the number of clusters, TCR beta VDJ genes were identified as strong conserved markers in several clusters. Would it be worth excluding the VDJ genes from the analysis?
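To make the idea of excluding VDJ genes concrete, here is a minimal sketch in Python (used purely for illustration; it is not part of the Seurat workflow described above). It assumes HGNC-style gene symbols such as TRBV7-2, TRBD1, and TRBJ2-1. The function name `split_vdj_genes`, the example gene list, and the choice to drop these genes only from the feature list used for dimension reduction (rather than from the expression matrix) are all assumptions of the sketch. The pattern requires a digit after the V/D/J letter so that unrelated genes such as TRADD, and constant-region genes such as TRAC, are left alone.

```python
import re

# Assumption: HGNC-style symbols (e.g. TRBV7-2, TRBD1, TRBJ2-1).
# Matches V, D, and J segments of the alpha, beta, delta, and gamma TCR loci.
# The trailing \d avoids false positives such as TRADD.
TCR_VDJ_PATTERN = re.compile(r"^TR[ABDG][VDJ]\d", re.IGNORECASE)


def split_vdj_genes(genes):
    """Split a list of gene symbols into (kept, removed) by the TCR V(D)J pattern."""
    kept, removed = [], []
    for gene in genes:
        if TCR_VDJ_PATTERN.match(gene):
            removed.append(gene)
        else:
            kept.append(gene)
    return kept, removed


# Hypothetical list standing in for the variable features used for PCA.
variable_features = [
    "CD8A", "TRBV7-2", "GZMK", "TRBJ2-1", "TRADD", "TRAC", "TRBD1", "IL7R",
]

kept, removed = split_vdj_genes(variable_features)
print("kept:", kept)
print("removed:", removed)
```

The same regular expression should carry over to other tools; whether removing these genes actually changes the clustering would need to be checked by rerunning the dimension reduction on the filtered feature list.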
Any comments on the appearance of the dim plot and its implications would be appreciated. Thank you!
Is this an integrated dataset? Did you run the Seurat integration routine? Otherwise it is almost certain that much of the cluster separation is due to the batch effects between the samples and time points.
What does that mean? Please elaborate.
Thank you, ATpoint. This is integrated data. I used SCTransform. Samples 1 and 2 were replicates of the same time point. I have included the code below.
As for the appearance of the plot, I am paraphrasing the feedback, since I was confused by it. I think the expectation was fewer clusters, with less sharp delineation and less separation between them. Another concern was that cluster 7 has satellite clusters.