For the Cell Ranger count pipeline, there is an optional expected-cells parameter (--expect-cells) whose default value is 3000. Is it okay to leave this as is, even if we expect to recover somewhere between 7000 and 11000 cells per sample? What exactly would we gain by setting this parameter, or lose by leaving it out?
Thanks for sharing. I was also curious about the impact of this parameter and tested it myself. I ran the pipeline once with the default setting and once with --expect-cells=10000, then compared the results. The statistics in the Summary page of web_summary.html were similar, although the Barcode Rank Plot showed more cell-associated barcodes when the parameter was set to 10000. However, in the Analysis page the clustering looked slightly different and the list of Top Features by Cluster (differentially expressed genes) was completely different. I also noticed that the log2FC values for the genes were much higher with the default setting. I am quite concerned that this difference might drastically affect the downstream analyses and conclusions.
You can try the scRNA-Seq pipeline by 10x cellranger software on CSI NGS Portal.
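One way to see how much of the difference comes from the cell calls themselves is to compare the filtered barcode lists from the two runs. The following is only a rough sketch: the run directory names (run_default, run_expect10000) are hypothetical, and the path outs/filtered_feature_bc_matrix/barcodes.tsv.gz assumes the output layout used by Cell Ranger 3 and later. If the two barcode sets differ noticeably, that alone may explain some of the change in clustering and top features.

```python
import gzip
from pathlib import Path


def read_barcodes(path):
    """Return the set of cell barcodes listed in a barcodes.tsv(.gz) file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return {line.strip() for line in handle if line.strip()}


def compare_cell_calls(barcodes_a, barcodes_b):
    """Summarise how two sets of cell-associated barcodes overlap."""
    shared = barcodes_a & barcodes_b
    union = barcodes_a | barcodes_b
    return {
        "n_a": len(barcodes_a),
        "n_b": len(barcodes_b),
        "shared": len(shared),
        "only_a": len(barcodes_a - barcodes_b),
        "only_b": len(barcodes_b - barcodes_a),
        # Jaccard index: 1.0 means both runs called exactly the same cells.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


if __name__ == "__main__":
    # Hypothetical run directories; replace with your own --id values.
    sub = "outs/filtered_feature_bc_matrix/barcodes.tsv.gz"
    default_run = Path("run_default") / sub
    expect_run = Path("run_expect10000") / sub
    if default_run.exists() and expect_run.exists():
        summary = compare_cell_calls(
            read_barcodes(default_run), read_barcodes(expect_run)
        )
        print(summary)
```

Barcodes that appear in only one of the two runs are the ones worth inspecting first, since they are likely to sit near the cell-calling threshold in the Barcode Rank Plot.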