My samples are patient-derived xenografts that contain both human epithelial cancer cells and mouse fibroblasts and blood cells. I run cellranger count using the combined human and mouse reference genome downloaded from 10x Genomics (https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-and-mm10-2020-A.tar.gz). After Leiden clustering, the human and mouse cells can be told apart by the marker genes found with scanpy.tl.rank_genes_groups. I then split the data into human cells and mouse cells, remove the mouse genes from the human cells, and re-cluster the raw counts of the human cells.
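The cluster-based split described above could be cross-checked per cell by looking at what fraction of each cell's counts falls on human versus mouse genes. This is a different technique from marker-gene clustering, and the sketch below is only a minimal illustration: it assumes gene names in the combined reference carry a species prefix (GRCh38_ for human, something starting with mm10_ for mouse; check your own feature names), and the 0.9 purity cutoff and the function name call_species are hypothetical choices, not Cell Ranger or scanpy defaults.

```python
HUMAN_PREFIX = "GRCh38_"  # assumed gene-name prefix for human genes
MOUSE_PREFIX = "mm10_"    # assumed; also matches names that start with "mm10___"


def call_species(counts, gene_names, min_fraction=0.9):
    """Label each cell 'human', 'mouse', 'mixed', or 'unassigned'.

    counts: one row per cell, one column per gene (raw UMI counts).
    gene_names: gene names in column order, carrying the species prefix.
    min_fraction: hypothetical purity cutoff; tune it for your data.
    """
    human_cols = [i for i, g in enumerate(gene_names) if g.startswith(HUMAN_PREFIX)]
    mouse_cols = [i for i, g in enumerate(gene_names) if g.startswith(MOUSE_PREFIX)]

    labels = []
    for row in counts:
        human = sum(row[i] for i in human_cols)
        mouse = sum(row[i] for i in mouse_cols)
        total = human + mouse
        if total == 0:
            labels.append("unassigned")  # no counts on either species
        elif human / total >= min_fraction:
            labels.append("human")
        elif mouse / total >= min_fraction:
            labels.append("mouse")
        else:
            labels.append("mixed")  # possible doublet or ambient RNA
    return labels


# Toy data: three cells, two human genes and two mouse genes.
genes = ["GRCh38_EPCAM", "GRCh38_KRT8", "mm10___Col1a1", "mm10___Ptprc"]
counts = [
    [50, 30, 1, 0],    # mostly human
    [0, 2, 40, 25],    # mostly mouse
    [20, 15, 18, 22],  # roughly even mix
]
print(call_species(counts, genes))  # ['human', 'mouse', 'mixed']
```

On real data the rows would come from the raw count matrix and the names from its gene index. Cells labelled mixed are worth inspecting before the human subset is re-clustered, since they may be human/mouse doublets.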
I am wondering whether this workflow is correct practice for the analysis.
Is it possible to extract the reads of the human cells from the FASTQ data and run cellranger count using only the human reference genome?
Thanks.
One option is to use Sargasso (https://github.com/biomedicalinformaticsgroup/Sargasso) to split the raw FASTQ reads by species. I have used this program before, but I am not sure how well it works for scRNA-seq data.
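If the human cell barcodes are already known from the clustering step, another route might be to keep only the read pairs whose cell barcode belongs to a human cell. The sketch below is not Sargasso and not a Cell Ranger feature; it is a minimal illustration that assumes the 10x 3' v2/v3 read layout (16 bp cell barcode at the start of R1) and exact barcode matches only, so reads whose barcode contains a sequencing error are dropped. The helper names read_fastq, load_barcodes and filter_pairs are hypothetical.

```python
import io
from itertools import islice

BARCODE_LEN = 16  # assumed: 10x 3' v2/v3 layout, cell barcode at the start of R1


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from an open text handle."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            return  # end of file (or a truncated final record)
        yield tuple(record)


def load_barcodes(names):
    """Strip the '-1' style suffix that Cell Ranger appends to barcodes."""
    return {name.split("-")[0] for name in names}


def filter_pairs(r1_records, r2_records, keep_barcodes):
    """Yield (r1, r2) pairs whose R1 barcode is in keep_barcodes (exact match)."""
    for r1, r2 in zip(r1_records, r2_records):
        if r1[1][:BARCODE_LEN] in keep_barcodes:
            yield r1, r2


# Toy example with two read pairs; only read1 carries a "human" barcode.
r1 = io.StringIO(
    "@read1\nAAAACCCCGGGGTTTTACGTACGTACGT\n+\n" + "I" * 28 + "\n"
    "@read2\nTTTTGGGGCCCCAAAAACGTACGTACGT\n+\n" + "I" * 28 + "\n"
)
r2 = io.StringIO("@read1\nGATTACA\n+\nIIIIIII\n@read2\nCATCATC\n+\nIIIIIII\n")
human = load_barcodes(["AAAACCCCGGGGTTTT-1"])

kept = list(filter_pairs(read_fastq(r1), read_fastq(r2), human))
print([r1_rec[0] for r1_rec, _ in kept])  # ['@read1']
```

For real gzip-compressed FASTQ files the handles could be opened with gzip.open(path, "rt"). Note that this selects by cell, not by species of origin, so any mouse-derived reads inside human droplets (ambient RNA) would still be present in the filtered files.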