I'm attempting to perform differential expression analysis using the FindAllMarkers function included in Seurat. I'd like to use the DESeq2 function as I've had good results on bulk RNAseq using that package in the past. However, when I try to run it, it throws an error since there are so many zero values in my counts matrix. Would it be statistical malpractice to simply add 1 to each matrix entry? I'm pretty sure this is done in differential expression analysis of bulk RNAseq but it's been a while so I can't remember.
You should remove genes with zero counts across the board (i.e. no expression in any sample). That removes irrelevant data points and is not "statistical malpractice".
EDIT: my statement applies to bulk RNAseq. I’m not sure if it’s relevant to scRNA-Seq. Sorry!
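For what it's worth, here is a minimal sketch of that filtering step, written in plain Python rather than R just to show the logic. The function name `drop_all_zero_genes` and the toy gene names are made up for illustration, and the matrix is assumed to be genes in rows and samples in columns.

```python
def drop_all_zero_genes(counts, gene_names):
    """Drop genes whose count is 0 in every sample.

    counts: list of rows, one row per gene, one column per sample.
    Returns (filtered_counts, kept_gene_names).
    """
    kept_rows, kept_names = [], []
    for name, row in zip(gene_names, counts):
        # Keep the gene if at least one sample has a non-zero count.
        if any(value != 0 for value in row):
            kept_rows.append(row)
            kept_names.append(name)
    return kept_rows, kept_names


# Toy matrix: 4 hypothetical genes x 3 samples.
genes = ["geneA", "geneB", "geneC", "geneD"]
counts = [
    [0, 0, 0],  # never detected, so it is removed
    [5, 0, 2],
    [0, 0, 1],
    [0, 0, 0],  # never detected, so it is removed
]

filtered, kept = drop_all_zero_genes(counts, genes)
print(kept)
```

Note that this only drops genes that are zero in every sample. Genes that are zero in some samples but not others stay in, so it may not be enough on its own for a sparse single-cell matrix.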
Do you know if there's any way to do this within Seurat? This post seems to indicate that they don't have any gene filtering functionality in their package. Or could you recommend another way to remove lowly expressed genes?
I edited my comment. I’m not sure my comment on DESeq2 applies to single cell experiments.
There are tools out there that impute values for zeros in RNAseq. I'm not an expert, so I make no claims about how well they work, or which is best, but you might like to check out:
bayNorm
scImpute
SAVER
MAGIC