Hey y'all,
I'm working on a scRNA-seq project using publicly available data in ScanPy. I'm stuck on the QC step of filtering out cells. I generated these scatter plots:
I'm having trouble interpreting why there are two clusters of cells in the bottom plot, especially the lower cluster with low n_genes_by_counts and higher total_counts. Does anyone have an idea what these cells could be, or how to look into them further? Help, please?
Could someone explain how to interpret these graphs, please?
These are cells with many counts but very few genes, maybe damaged cells with poor capture of transcripts. Can you check whether it is ribosomal genes that make these cells separate out at the bottom of plot 2?
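One way to run this check is to compute, per cell, the share of counts that come from ribosomal genes, then compare the low-gene cluster against the rest. Below is a rough sketch with NumPy on a toy matrix. It assumes human gene symbols, where ribosomal protein genes start with RPS or RPL. The helper name `pct_counts_for_prefixes` and the toy numbers are made up for illustration. In Scanpy, the same number should be available by flagging the genes in a boolean `adata.var` column and passing that column name via `qc_vars` to `sc.pp.calculate_qc_metrics`.

```python
import numpy as np

def pct_counts_for_prefixes(counts, gene_names, prefixes):
    """Percent of each cell's counts from genes whose symbol starts with one of `prefixes`.

    counts: dense cells x genes array of raw counts (hypothetical input;
            with AnnData this would be adata.X, densified if it is sparse).
    """
    counts = np.asarray(counts, dtype=float)
    mask = np.array([name.startswith(tuple(prefixes)) for name in gene_names])
    total = counts.sum(axis=1)
    subset = counts[:, mask].sum(axis=1)
    # Cells with zero total counts get 0 instead of a division error.
    return np.divide(100.0 * subset, total, out=np.zeros_like(total), where=total > 0)

# Toy data: 3 cells x 4 genes (made-up numbers, not from the dataset in this thread).
genes = ["RPS3", "RPL10", "INS", "GCG"]
counts = np.array([
    [90, 90, 10, 10],   # dominated by ribosomal genes
    [10, 10, 40, 40],   # mostly non-ribosomal
    [0, 0, 0, 0],       # empty barcode
])
pct_ribo = pct_counts_for_prefixes(counts, genes, ("RPS", "RPL"))
print(pct_ribo)
```

If the lower cluster shows a much higher ribosomal percentage than the upper one, that would support the idea that a handful of highly expressed genes account for most of the counts in those cells.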
Thank you ATpoint
This really helped me learn more about ScanPy.
This tutorial helped me too:
https://nbisweden.github.io/workshop-scRNAseq/labs/compiled/scanpy/scanpy_01_qc.html
So this is the percentage of counts for ribosomal genes and hemoglobin genes:
In your experience, where would you set the cutoff for this dataset? It's a human fetal pancreas dataset.
I did the cut-off like so:
But I went from an original ~9000 cells to 156 cells! I guess there was actually this much damage?
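When a combined filter removes this many cells, it can help to count how many cells pass each threshold on its own before combining them, since a single overly strict cutoff may be doing most of the removal. The sketch below does this on a toy `obs`-like table. The column names follow Scanpy's `pct_counts_<name>` convention and assume the `qc_vars` were called `ribo` and `hb`. The helper `cells_passing_each_filter`, the toy values, and all thresholds are placeholders, not recommendations for this dataset.

```python
import pandas as pd

def cells_passing_each_filter(obs, filters):
    """Count cells passing each filter on its own, plus the count passing all of them."""
    passed = pd.DataFrame({name: rule(obs) for name, rule in filters.items()})
    summary = {name: int(n) for name, n in passed.sum().items()}
    summary["all_filters"] = int(passed.all(axis=1).sum())
    return summary

# Toy per-cell QC table (made-up values standing in for adata.obs).
obs = pd.DataFrame({
    "n_genes_by_counts": [2500, 300, 1800, 2200, 150],
    "pct_counts_ribo": [12.0, 3.0, 4.0, 25.0, 2.0],
    "pct_counts_hb": [0.1, 0.0, 8.0, 0.2, 0.0],
})

# Placeholder thresholds; each rule returns a boolean Series (True = keep the cell).
filters = {
    "min_genes": lambda o: o["n_genes_by_counts"] >= 200,
    "ribo_above_5pct": lambda o: o["pct_counts_ribo"] > 5,
    "hb_below_5pct": lambda o: o["pct_counts_hb"] < 5,
}

summary = cells_passing_each_filter(obs, filters)
print(summary)
```

With a real dataset, passing `adata.obs` in place of the toy table should show which single threshold accounts for most of the drop.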