Hi all,
I am analyzing single-nucleus RNA-seq data with Seurat. I have four groups and 24 samples in total (brain region A control and case, brain region B control and case; n = 6 each). I used Seurat's integration functions to analyze the samples together. However, DoubletFinder cannot be applied to the integrated data, because normalization is performed for each sample before integration.
So I have a few questions. 1) Is it correct to run DoubletFinder on each sample first and then integrate? 2) Would it be okay to remove the top 5% of cells by nCount_RNA in each sample instead of using DoubletFinder? (e.g. keep Sample A < 12000, Sample B < 11000, ...)
Or is there a good way to apply DoubletFinder to integrated data?
Thank you!
I personally think doublet detection should be performed jointly on all cells that were processed in a single tube during library prep. So if you have 10X data, that means all cells that were loaded into the same well during GEM formation. If your samples were in different wells, then it is impossible for doublets to form between them, regardless of what any tool tells you computationally.
So how do I apply DoubletFinder (or other tools such as Scrublet) correctly when using Seurat integration?
Integration and doublet identification are two separate things. Since doublets can only occur within a sample (and there is no integration within a sample), you should, imo, perform doublet detection per sample, independently of integration.
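To make the "per sample, before integration" idea concrete, here is a minimal sketch in plain Python of the count-based filter asked about above (dropping the top 5% of cells by nCount_RNA within each sample). It is not a doublet detector and does not call Seurat or DoubletFinder. The function name, the `(barcode, sample_id, n_count)` tuple layout, and the toy data are all assumptions made up for illustration. The only point is that the cutoff is computed separately inside each sample, never on the pooled data.

```python
from collections import defaultdict


def drop_top_counts_per_sample(cells, top_percent=5):
    """Drop the highest-count cells within each sample (hypothetical helper).

    cells: iterable of (barcode, sample_id, n_count) tuples.
    top_percent: integer percentage of cells to drop per sample.
    Returns the list of cells that were kept.
    """
    # Group cells by sample so each sample gets its own cutoff.
    by_sample = defaultdict(list)
    for barcode, sample_id, n_count in cells:
        by_sample[sample_id].append((barcode, n_count))

    kept = []
    for sample_id, sample_cells in by_sample.items():
        # Integer arithmetic: number of cells to remove from this sample.
        n_remove = len(sample_cells) * top_percent // 100
        # Rank by count, highest first, and skip the top n_remove cells.
        ranked = sorted(sample_cells, key=lambda c: c[1], reverse=True)
        for barcode, n_count in ranked[n_remove:]:
            kept.append((barcode, sample_id, n_count))
    return kept


# Toy data (made up): sample A has 20 cells, sample B has 40 cells.
cells = [(f"A_{i}", "A", 500 * i) for i in range(1, 21)]
cells += [(f"B_{i}", "B", 300 * i) for i in range(1, 41)]

kept = drop_top_counts_per_sample(cells, top_percent=5)
print(len(cells), "cells in,", len(kept), "cells kept")
```

The same pattern applies if the per-sample step is a real doublet caller instead of a count cutoff: split by sample (or by 10X well), run the tool on each piece, remove the flagged cells, and only then integrate.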