In single-cell analysis, I know that the first steps typically involve filtering out dead cells based on the fraction of reads from mitochondrial and ribosomal genes, low total counts, and low numbers of detected features. However, due to the nature of my wet lab experiment, the number of dead cells is important to me. What should I do in this case? I know that some cells will die because of the experimental conditions. But how can I determine whether the dead cells in my single-cell data are truly a result of the experiment or are caused by technical issues?
A low-quality cluster is forming in the UMAP plot. How can I determine whether the cells in this cluster show a high mitochondrial ratio because they lysed in the droplets, or because they are in the early stages of apoptosis? If they are apoptotic cells, I would like to keep them, even though I don't see apoptotic markers in their expression profiles. But if they are broken cells resulting from technical issues, I would like to exclude them.
If I keep this cluster despite its low quality, wouldn't it create noise during cell annotation and DEG analysis, making it harder to assign other cells correctly?
I don’t have a great answer to your question since I’ve always excluded cells that I think are dead/bad/low-quality/etc. However, since you’re doing a wet lab experiment, can’t you sequence dying cells and see what they would look like?
If you run FeaturePlot after UMAP as given in the code above, you will get the UMAP plot colored by mitochondrial percentage. You can generate the same kind of plot for apoptotic marker genes by passing their gene names. Nobody forces you to follow a "standard" of removing high-MT cells if it isn't appropriate for your experiment. Indeed, in the single-cell workshop I attended, it was recommended to always leave them in and instead color them in the dimensionality reduction plot. The problem is rather with the experimental question. What could you possibly expect to learn from a dead cell? You shouldn't focus on them, but on the living cells that still have a coordinated stress response.
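To make the "leave them in and label them" idea concrete, here is a minimal sketch in plain Python rather than Seurat. Everything in it is an assumption for illustration: the `Cell` record, the `qc_label` helper, the label names, and the thresholds (20% MT, 200 features) are hypothetical and would need tuning per tissue and protocol. The labels are only a rough way to separate groups for plotting, not a validated classifier of apoptosis versus lysis.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    barcode: str
    total_counts: int   # total UMI counts in the cell
    n_features: int     # number of genes detected
    mt_counts: int      # UMI counts from mitochondrial genes


def percent_mt(cell: Cell) -> float:
    """Percentage of counts that come from mitochondrial genes."""
    if cell.total_counts == 0:
        return 0.0
    return 100.0 * cell.mt_counts / cell.total_counts


def qc_label(cell: Cell, max_pct_mt: float = 20.0, min_features: int = 200) -> str:
    """Assign a QC label instead of dropping the cell.

    Thresholds and label names are illustrative assumptions.
    """
    high_mt = percent_mt(cell) > max_pct_mt
    low_complexity = cell.n_features < min_features
    if high_mt and low_complexity:
        return "high_mt_low_complexity"   # candidate ruptured cells
    if high_mt:
        return "high_mt_only"             # candidate stressed but intact cells
    if low_complexity:
        return "low_complexity_only"
    return "pass"


# Made-up example cells; none of them are removed, only labeled.
cells = [
    Cell("AAAC", total_counts=5000, n_features=1800, mt_counts=150),
    Cell("AAAG", total_counts=4000, n_features=1500, mt_counts=1200),
    Cell("AAAT", total_counts=300, n_features=120, mt_counts=180),
    Cell("AACA", total_counts=400, n_features=150, mt_counts=8),
]
labels = {c.barcode: qc_label(c) for c in cells}

for barcode, label in labels.items():
    print(barcode, label)
```

With labels like these stored as per-cell metadata, you can color the UMAP by label, count the cells in each group per condition, and still leave the flagged groups out of annotation and DEG analysis if they add noise.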
Btw, the reason MT% is used as a filter is that in a dead cell, most other RNA is washed out through the ruptured plasma membrane, and only the MT and plastid transcripts may remain because they are enclosed within their organelles.
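A toy calculation shows why losing cytoplasmic RNA pushes MT% up even though the mitochondrial counts themselves don't change. This is only a sketch: the gene counts are made up, `simulate_leak` is a hypothetical helper, and the `MT-` prefix assumes human gene symbols (mouse symbols use `mt-`).

```python
def percent_mt(counts: dict, mt_prefix: str = "MT-") -> float:
    """Percentage of a cell's counts that come from mitochondrial genes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mt = sum(n for gene, n in counts.items() if gene.startswith(mt_prefix))
    return 100.0 * mt / total


def simulate_leak(counts: dict, retained_fraction: float, mt_prefix: str = "MT-") -> dict:
    """Keep mitochondrial counts as-is; keep only a fraction of all other counts.

    A crude stand-in for cytoplasmic RNA leaking through a ruptured membrane.
    """
    return {
        gene: n if gene.startswith(mt_prefix) else round(n * retained_fraction)
        for gene, n in counts.items()
    }


# Made-up counts for one intact cell.
intact = {"MT-CO1": 60, "MT-ND1": 40, "ACTB": 500, "GAPDH": 400}
# Same cell after losing 90% of its non-mitochondrial transcripts.
leaky = simulate_leak(intact, retained_fraction=0.1)

print(f"intact: {percent_mt(intact):.1f}% MT, {sum(intact.values())} counts")
print(f"leaky:  {percent_mt(leaky):.1f}% MT, {sum(leaky.values())} counts")
```

In this toy example the MT fraction rises from 10% to over 50% while the total count drops sharply. High MT% together with low total counts is therefore the pattern you'd expect from ruptured cells.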