Hi all,
I'd be grateful if anyone familiar with Tabula Sapiens or scanpy could help with a question. I'm looking into using Tabula Sapiens (10x Genomics data across several organs) to check some cell-specific markers, but I'm finding that some genes appear to have their expression values capped. A small proportion of cells overall have expression values of 10.00, which leads to some odd-looking distributions, e.g. the violin plot below of normalised expression values for ANKRD1, a gene whose values are capped at 10.00 in Tabula Sapiens:
It's hard to tell from the manuscript, but it seems likely the data were normalised with scanpy, which I haven't used myself. Could this be the source of the capped values? I can't find any information on why the data look like this. It seems likely to me to skew differential expression results and logFC values quite a bit.
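To illustrate what I suspect is happening, here is a toy sketch in plain Python (no scanpy needed). It assumes the values were z-scored per gene and then clipped at 10, which is what scanpy's `sc.pp.scale` does when called with `max_value=10`. That this was done for Tabula Sapiens is my guess and not something I could confirm from the manuscript. The helper names and the toy data are made up for the example.

```python
from statistics import mean, pstdev


def scale_and_clip(values, max_value=10.0):
    """Z-score one gene across cells, then clip at max_value (hypothetical helper)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A constant gene has no variance to scale by
        return [0.0 for _ in values]
    return [min((v - mu) / sd, max_value) for v in values]


def fraction_at_cap(values, cap=10.0, tol=1e-6):
    """Fraction of cells whose value sits at the cap (hypothetical helper)."""
    return sum(abs(v - cap) <= tol for v in values) / len(values)


# Toy marker gene: off in most cells, strongly expressed in a handful
toy_expression = [0.0] * 995 + [5.0] * 5

scaled = scale_and_clip(toy_expression)
print("max scaled value:", max(scaled))
print("fraction of cells at cap:", fraction_at_cap(scaled))
```

With a sparse marker like this, the few expressing cells have z-scores well above 10, so they would all pile up at exactly the cap, which would look a lot like the ANKRD1 plot.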