I'm having some issues with the alignment of 10x scRNAseq data to a custom genome and I thought it would be good to do a stricter filtering on the cells so more cells get filtered out as empty droplets. Apparently STARsolo has several filtering algorithms, of which the EmptyDrops one is most similar to the CellRanger filtering. The manual lists 10 different parameters, but I can't make sense of what they mean and there's no further explanation.
In STARsolo, this filtering can be activated with --soloCellFilter EmptyDrops_CR. It can be followed by 10 numeric parameters: nExpectedCells (3000), maxPercentile (0.99), maxMinRatio (10), indMin (45000), indMax (90000), umiMin (500), umiMinFracMedian (0.01), candMaxN (20000), FDR (0.01), simN (10000).
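To make the positional layout concrete, here is a minimal sketch that assembles the option from the ten values listed above. It assumes the values are passed positionally in exactly the order the manual lists them; the helper name emptydrops_cr_args is made up for this illustration, and raising umiMin to 1000 is only an example value, not a recommendation.

```python
# Defaults as listed in the manual, kept in the order they are listed.
# Assumption: STARsolo reads them positionally in this same order.
EMPTYDROPS_CR_DEFAULTS = {
    "nExpectedCells": 3000,
    "maxPercentile": 0.99,
    "maxMinRatio": 10,
    "indMin": 45000,
    "indMax": 90000,
    "umiMin": 500,
    "umiMinFracMedian": 0.01,
    "candMaxN": 20000,
    "FDR": 0.01,
    "simN": 10000,
}


def emptydrops_cr_args(**overrides):
    """Return the --soloCellFilter arguments with selected values overridden.

    Hypothetical helper for illustration; it is not part of STAR.
    """
    unknown = set(overrides) - set(EMPTYDROPS_CR_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    params = {**EMPTYDROPS_CR_DEFAULTS, **overrides}
    # Emit all ten values, because positional parameters cannot be skipped.
    values = [str(params[name]) for name in EMPTYDROPS_CR_DEFAULTS]
    return ["--soloCellFilter", "EmptyDrops_CR", *values]


if __name__ == "__main__":
    # Example: change only umiMin and keep every other value at its default.
    print(" ".join(emptydrops_cr_args(umiMin=1000)))
```

Because the values are positional, changing a single one (here umiMin) still means writing out all ten.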
Now if I just want a stricter threshold on e.g. UMIs/cell for the filtering (see picture), should I increase umiMin or also set anything else? If someone can provide a full explanation of the parameters that would also be great.
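For what it's worth, here is a rough sketch of how I understand the first three parameters (nExpectedCells, maxPercentile, maxMinRatio) to act. It follows the CellRanger 2.2-style "robust maximum divided by a ratio" rule. This is my reading, not something the manual text quoted above confirms, so treat the function name and the exact rounding as assumptions.

```python
def simple_umi_threshold(umi_counts, n_expected_cells=3000,
                         max_percentile=0.99, max_min_ratio=10):
    """Sketch of a CellRanger 2.2-style UMI cutoff (assumed behaviour).

    Barcodes with at least the returned number of UMIs would be called cells
    by this first pass; EmptyDrops would then test lower-count candidates.
    """
    if not umi_counts:
        raise ValueError("umi_counts must not be empty")
    counts = sorted(umi_counts, reverse=True)  # highest UMI count first
    # Rank of the "robust maximum": the top (1 - maxPercentile) fraction
    # of the expected cells is skipped as potential outliers.
    idx = min(len(counts) - 1, int(n_expected_cells * (1.0 - max_percentile)))
    robust_max = counts[idx]
    # Threshold is the robust maximum divided by maxMinRatio, at least 1.
    return max(1, int(robust_max / max_min_ratio + 0.5))


if __name__ == "__main__":
    # Toy data: 10,000 barcodes with UMI counts 1..10000.
    toy_counts = list(range(1, 10001))
    print(simple_umi_threshold(toy_counts))
    # A smaller maxMinRatio gives a higher, stricter cutoff on the same data.
    print(simple_umi_threshold(toy_counts, max_min_ratio=5))
```

If this reading is right, lowering maxMinRatio raises the first-pass cutoff, while umiMin would only limit which lower-count barcodes are considered as EmptyDrops candidates.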
What should the command look like, Rob? I tried
and I am getting the error
May I ask how that is going? I'm having the same problem. I was working with a single-nucleus dataset of four teleost samples and thought stricter filtering would be better, but the result with --soloCellFilter EmptyDrops_CR differed a lot from my previous alignment with --soloCellFilter TopCells 25000. TopCells identified around 25,000 cells in each sample, while EmptyDrops_CR called about 7,000 cells in one of my samples and around 25,000 in each of the other three. Should I change any of the numeric parameters for EmptyDrops_CR, or could it just be a problem with that specific sample?