Dear all,
I downloaded two scRNA-seq data files from GEO: xxx_metadata.txt.gz and xxx_counts.txt.gz. I want to load these files and create a Seurat object.
The xxx_metadata.txt.gz file can be read with:
metadata <- read.table(gzfile("xxx_metadata.txt.gz"), header = TRUE, sep = "\t")
However, when I use read.table to read the counts file:
counts <- read.table(gzfile("xxx_counts.txt.gz"), header = TRUE, sep = "\t", row.names = 1)
I found that it is extremely slow and consumes a large amount of RAM. The xxx_counts.txt.gz file is about 2 GB, but reading it with read.table does not finish even after 10 GB of RAM has been taken up.
Could you tell me how to create a Seurat object from these two files? I would deeply appreciate any help.
How big are these files? My best guess is that this is a matrix. If there is more than one sample, you would have to deal with that separately, and you also need to know whether the data are already normalized or are raw counts. Either way, once you have loaded the matrix into R, you can create a Seurat object using:
library(Seurat)
seuratObj <- CreateSeuratObject(counts = counts)
Also, you may be able to reduce RAM consumption by storing the counts as a sparse matrix.
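Here is a minimal sketch of that idea, not a tested recipe for your particular files. It assumes the counts file is a tab-separated genes-by-cells table whose first column holds gene names, and that the first column of the metadata file holds cell barcodes matching the column names of the counts table; check both before relying on it. The helper names read_counts_sparse and make_seurat are made up for illustration. It swaps read.table for fread from the data.table package, which is usually much faster on large text files, and uses the Matrix package for the sparse representation. Note that the dense table still has to fit in memory once during the conversion; the sparse matrix mainly saves RAM in everything that follows.

```r
library(data.table)  # fread(); reading a .gz file directly also requires the R.utils package
library(Matrix)      # sparse matrix classes

# Read a genes-x-cells text table and return it as a sparse matrix.
# Assumption: the first column holds gene names and every other column is one cell.
read_counts_sparse <- function(path) {
  dt <- fread(path, sep = "\t", header = TRUE)
  dense <- as.matrix(dt, rownames = 1)  # first column becomes the row names
  rm(dt)                                # drop the data.table copy before converting
  Matrix(dense, sparse = TRUE)          # keep only the non-zero entries
}

# Build the Seurat object and attach the per-cell metadata.
# Assumption: the first metadata column holds cell barcodes that match colnames(counts).
make_seurat <- function(counts_path, metadata_path) {
  counts <- read_counts_sparse(counts_path)
  metadata <- read.table(gzfile(metadata_path), header = TRUE, sep = "\t", row.names = 1)
  Seurat::CreateSeuratObject(counts = counts, meta.data = metadata)
}

# Example call with the file names from the question:
# seuratObj <- make_seurat("xxx_counts.txt.gz", "xxx_metadata.txt.gz")
```

If the metadata row names do not match the cell names in the counts matrix, the metadata will not line up with the cells, so it is worth comparing colnames(counts) with rownames(metadata) before creating the object.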