Hello, I wish to cluster my single-cell 10x data using RaceID3. However, I cannot load the data into RaceID with its SCseq function.
10x gave me 3 files: 1) barcodes.tsv.gz 2) features.tsv.gz 3) matrix.mtx.gz
I first used Seurat's Read10X function:
library(Seurat)
library(RaceID)
pbmc.data <- Read10X(data.dir = "C:/Users/s/Downloads/")
sc <- SCseq(pbmc.data)
Here is my pbmc.data:
> pbmc.data
33694 x 27179520 sparse Matrix of class "dgCMatrix"
This is the error I get:
sc <- SCseq(pbmc.data)
Error in asMethod(object) : Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
I also tried using the Matrix package in R.
library(Matrix)
matrix_dir <- "C:/Users/s/Downloads/"
barcode.path <- paste0(matrix_dir, "barcodes.tsv.gz")
features.path <- paste0(matrix_dir, "features.tsv.gz")
matrix.path <- paste0(matrix_dir, "matrix.mtx.gz")
mat <- readMM(file = matrix.path)
feature.names <- read.delim(features.path,
                            header = FALSE,
                            stringsAsFactors = FALSE)
barcode.names <- read.delim(barcode.path,
                            header = FALSE,
                            stringsAsFactors = FALSE)
colnames(mat) <- barcode.names$V1
rownames(mat) <- feature.names$V1
This also fails, because it tries to allocate a huge amount of memory:
> sc <- SCseq(mat)
Error: cannot allocate vector of size 6823.1 Gb
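The size in that error can be reproduced by hand. That suggests the sparse matrix is being converted to a dense matrix of 8-byte doubles at some point during the SCseq call (an assumption based on the numbers, not something confirmed from the RaceID source). A quick check in R, using only the dimensions printed above:

```r
# Dimensions reported for pbmc.data above
n_genes    <- 33694
n_barcodes <- 27179520

# A dense numeric matrix in R stores every entry as an 8-byte double
bytes_per_double <- 8

# Size of the dense equivalent, in the 1024^3-byte units R reports as "Gb"
dense_gb <- n_genes * n_barcodes * bytes_per_double / 1024^3
round(dense_gb, 1)
# [1] 6823.1
```

The result matches the 6823.1 Gb in the error message, so the problem is the number of barcodes rather than the way the files are read.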
I understand that RaceID requires a sparse matrix, which I am already providing. Can someone please explain why this fails?
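One thing that might help, as a minimal sketch: 27,179,520 columns is far more barcodes than the number of cells one would expect from 4 samples, so the matrix probably still contains empty droplets (an inference from the dimensions, not something stated above). Dropping barcodes with very few UMIs while the matrix is still sparse would shrink it considerably before it is passed on. The toy matrix and the min_umis cutoff below are hypothetical stand-ins; only functions from the Matrix package are used.

```r
library(Matrix)

# Toy stand-in for the 10x count matrix: 5 genes x 8 barcodes,
# where most barcodes are empty or nearly empty.
counts <- sparseMatrix(
  i = c(1, 2, 3, 1, 4, 5, 2),
  j = c(1, 1, 1, 4, 4, 4, 7),
  x = c(10, 5, 3, 7, 2, 9, 1),
  dims = c(5, 8),
  dimnames = list(paste0("gene", 1:5), paste0("bc", 1:8))
)

# Hypothetical cutoff: keep barcodes with at least this many UMIs in total
min_umis <- 5

# Matrix::colSums works on the sparse representation without densifying
umis_per_barcode <- Matrix::colSums(counts)

# Subset columns while staying sparse
filtered <- counts[, umis_per_barcode >= min_umis, drop = FALSE]
dim(filtered)
```

On the real data the same subsetting would be applied to pbmc.data before calling SCseq. A sensible cutoff depends on the experiment, so the value used here should not be taken as a recommendation.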
Thank you for your reply, @Devon Ryan.
The count matrix I am using is a "cellranger aggr" aggregate of 4 different samples. I tried clustering with Seurat, but since my samples are cell cultures grown under different conditions, they do not cluster well.
I could run this on a single sample, but that would defeat the purpose of identifying cell lineages.
Could you please recommend any other tools I could use for this? I am currently trying out Slingshot.
Wishing you a Happy New Year and Decade!
Play with the parameters in Seurat more, including how you're dealing with batches (i.e., samples). You can also try tools like Scanorama and Scanpy.