Hi,
I am a little lost here. I am selecting cells that express certain genes at the same time, and I call them group one. These happened to be 90 cells. I then want to compare them with the rest of the cells, which number 30,000 or 2,000. Seurat's differential expression test does run, but I fear that the larger group will contain more zero values, which will pull that group's expression down and give incorrect results. What is a decent way to compare these two groups? Which algorithm can I use? I have heard that MAST is robust, but I struggle to understand the mathematical workings of these tools. Your guidance is appreciated.
I would keep it transparent. Run the DE after randomly downsampling the large group to the size of the small group. Repeat that many times, then either average the statistics or use some sort of meta-analysis, such as RRA (robust rank aggregation), to get a single p-value.
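To make the idea concrete, here is a minimal sketch for a single gene in plain Python. The function names `rank_sum_pvalue` and `subsampled_de` are made up for illustration, the Wilcoxon rank-sum test (normal approximation, no continuity correction) is only an example of a DE test, and the median p-value is a simple stand-in for a proper aggregation method, not RRA itself.

```python
import random
from statistics import NormalDist, median


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return 1.0  # all values identical, nothing to test
    z = (u - mean_u) / var_u ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def subsampled_de(small, large, n_iter=100, seed=0):
    """Downsample `large` to len(small) n_iter times and return the median p-value."""
    rng = random.Random(seed)
    pvals = []
    for _ in range(n_iter):
        sub = rng.sample(large, len(small))  # sampling without replacement
        pvals.append(rank_sum_pvalue(small, sub))
    return median(pvals)
```

Here `small` and `large` would be the expression values of one gene in the 90 cells and in the remaining cells. Because every iteration reuses the same 90 cells, the per-iteration p-values are not independent, so I would treat the summary as a ranking aid rather than an exact p-value.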
I would probably use a pseudobulk approach.
That is preferred if you have true biological replicates. It can still be combined with my subsetting strategy: for example, build the bulks from different random sets of cells 100 times, then aggregate all of these results into a single one. I am a bit worried that comparing so few cells against so many results in different technical dropouts (zeros) for many genes, so subsampling would somewhat compensate for that between groups (I think).
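A rough sketch of how the pseudobulk step could be combined with subsampling, again in plain Python. The function name `pseudobulk`, its parameters, and the `(replicate, group, counts)` input layout are assumptions for illustration, not part of any package. Each bulk is the gene-wise sum of raw counts over a random draw of at most `n_cells_per_bulk` cells from one replicate and group.

```python
import random
from collections import defaultdict


def pseudobulk(cells, n_cells_per_bulk, seed=0):
    """Sum raw counts per gene over randomly drawn cells of each (replicate, group).

    cells: list of (replicate, group, counts), where counts is a list of raw
    counts with one entry per gene, in the same gene order for every cell.
    Returns a dict mapping (replicate, group) to the summed count vector.
    """
    rng = random.Random(seed)
    by_key = defaultdict(list)
    for replicate, group, counts in cells:
        by_key[(replicate, group)].append(counts)

    bulks = {}
    for key in sorted(by_key):  # sorted so a given seed always gives the same draw
        profiles = by_key[key]
        k = min(n_cells_per_bulk, len(profiles))  # cap at the available cells
        chosen = rng.sample(profiles, k)
        # zip(*chosen) regroups the values gene by gene across the chosen cells.
        bulks[key] = [sum(gene_counts) for gene_counts in zip(*chosen)]
    return bulks
```

Calling this with a different `seed` in each iteration and setting `n_cells_per_bulk` to the size of the smallest group per replicate would give the repeated, size-matched bulks. The resulting count vectors would then go into a bulk DE method, and the per-iteration results would be aggregated as described above.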
The best option would obviously be a better experimental design that somehow enriches for the low-abundance cells or depletes the high-abundance ones.