Dear experts,
Is there a way to decide the optimal number of features to use when integrating multiple samples with Seurat in scRNA-seq?
The default number in SelectIntegrationFeatures is 2000. Should I try a gradient such as 2000, 5000, 7000, 10000 to decide this number?
My concern is this: when I find the DEGs for a specific cluster with min.pct set to 0.5, a gene can pass this threshold on pct.2 while its pct.1 is still quite low. For example, when pct.1 and pct.2 are each based on 1000 cells, C15orf48 is detected in only 10 cells for pct.1 and in 717 cells for pct.2. Some clusters contain only about 200 cells. In that case, how can I be confident that a gene is differentially expressed?
Thank you very much!
The number of features selected during integration is mainly there to save computational resources, and it won't affect DE analysis, as you will not test on the integrated assay. 2000 or 3000 genes are fine.
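On the pct.1 / pct.2 observation in the question: as far as I know, min.pct in FindMarkers keeps a gene for testing if it is detected in at least that fraction of cells in either of the two groups, so a gene with a low pct.1 still passes when pct.2 is high. Below is a minimal sketch of that arithmetic using the C15orf48 numbers from the question. It is plain Python, not Seurat's own R code; the helper names are made up for illustration, and reading the numbers as 1000 cells per group is an assumption.

```python
def detection_fraction(cells_detected, cells_total):
    """Fraction of cells in a group in which the gene is detected."""
    return cells_detected / cells_total


def passes_min_pct(pct_1, pct_2, min_pct):
    """Hypothetical helper: the gene is kept for testing if EITHER group
    reaches the min_pct detection fraction (assumed to mirror min.pct)."""
    return max(pct_1, pct_2) >= min_pct


# Numbers from the question, assuming 1000 cells in each group.
pct_1 = detection_fraction(10, 1000)    # cluster of interest
pct_2 = detection_fraction(717, 1000)   # comparison cells
gene_is_tested = passes_min_pct(pct_1, pct_2, min_pct=0.5)

print(f"pct.1 = {pct_1:.3f}, pct.2 = {pct_2:.3f}, tested = {gene_is_tested}")
```

A low pct.1 next to a high pct.2 is what you would expect for a gene that is depleted in the cluster, rather than a sign that the filter failed. If you only want genes that are higher in the cluster, FindMarkers has an only.pos argument for that.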
Thanks a lot!
May I know how to do the enrichment analysis for the DEGs of each cluster?
The DEGs can vary: when the number of features is set to 2000 and logfc.threshold = 0, the full result for a cluster has nearly 2000 genes; when nfeatures = 5000 and logfc.threshold = 0, the full result for a cluster has nearly 3000 genes. Going from 2000 to 3000 genes may lead to a relatively big difference.
Or is DEenrichRPlot the best way to do the enrichment analysis?
Thanks a lot!
What does "consersion" mean? If it's a typo, I cannot figure out what you actually meant to use there.
Thanks a lot for pointing this out. It should be "concern" instead. Best!
Thanks for fixing it. I'll delete this comment thread soon just to clean up the post.