Because of dropout, the expression matrix from scRNA-seq contains a lot of zero counts. But if there were no dropout, what proportion of zero counts would be reasonable? I couldn't find any information on this. Does anyone have an idea?
Unfortunately, the answer is very organism- and cell-type-specific. It also depends on the sequencing depth you used. In conventional (bulk) RNA-seq from a population of cells, often 1/3 to 1/2 of genes are not expressed at levels above noise. If you have scRNA-seq data from a homogeneous cell population, you should be able to distinguish genes that are frequently dropped from genes that are never expressed, based on how many cells you get reads from for each gene. Another, more painful approach: if there are genes that you know should or should not be expressed in your cell type, you can use them to help define the expression threshold.
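To make the "how many cells you get reads from" idea concrete, here is a minimal sketch in Python with NumPy. It assumes a genes x cells matrix of raw counts from a homogeneous population. The function name `detection_summary`, the label names, and the `min_fraction` cutoff are all hypothetical choices for illustration, not an established standard, and a gene labelled "intermittent" is only a candidate for dropout (it could also be genuinely variable between cells).

```python
import numpy as np

def detection_summary(counts, min_fraction=0.5):
    """Summarise zeros in a genes x cells count matrix.

    min_fraction is an arbitrary, illustrative cutoff (an assumption):
    genes detected in at least this fraction of cells are labelled
    "consistently_detected".
    """
    counts = np.asarray(counts)
    detected = counts > 0                    # True where a gene has at least one read in a cell
    detection_rate = detected.mean(axis=1)   # fraction of cells in which each gene is seen
    overall_zero_fraction = 1.0 - detected.mean()  # share of zero entries in the whole matrix

    labels = np.where(
        detection_rate == 0,
        "never_detected",                    # no reads in any cell: plausibly not expressed
        np.where(
            detection_rate >= min_fraction,
            "consistently_detected",
            "intermittent",                  # seen in some cells only: dropout candidate
        ),
    )
    return overall_zero_fraction, detection_rate, labels

# Toy example: 3 genes x 4 cells (made-up numbers, not real data)
toy = np.array([
    [0, 0, 0, 0],
    [5, 0, 3, 1],
    [0, 2, 0, 0],
])
zero_fraction, rate, labels = detection_summary(toy)
print(zero_fraction, rate, labels)
```

The share of "never_detected" genes gives a rough lower bound on the zeros that are probably not caused by dropout, which you can compare with the overall zero fraction of the matrix.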
These methods should be useful for evaluating clustering and imputation methods. However, I was hoping to learn whether there is an approximate expected rate of zero counts. In scRNA-seq, zeros typically make up about 80% of the matrix because of dropout, but the zero-expression rate of the true data is not known.