How to create a Seurat object from the `xxx_metadata.txt.gz` and `xxx_counts.txt.gz` files?
17 months ago
Sun • 0

Dear all,

I downloaded two scRNA-seq data files from GEO: xxx_metadata.txt.gz and xxx_counts.txt.gz. I want to load these files and create a Seurat object.

The xxx_metadata.txt.gz file can be read by:

metadata <- read.table(gzfile("xxx_metadata.txt.gz"), header = TRUE, sep = "\t")

However, when I use read.table to read the counts file:

counts <- read.table(gzfile("xxx_counts.txt.gz"), header = TRUE, sep = "\t", row.names = 1)

I found that it is extremely slow and consumes a large amount of RAM. The xxx_counts.txt.gz file is about 2 GB, but read.table does not finish even after 10 GB of RAM has been taken up.

Could you tell me how to create a Seurat object from these two files? I would deeply appreciate any help.

Seurat • 3.1k views

How big are these files? My best guess is that the counts file contains a matrix. If there is more than one sample, you would have to deal with that separately, and you would also need to know whether the data are already normalized or are raw counts. Either way, once you have loaded the matrix into R, you can create a Seurat object using

library(Seurat)

seuratObj <- CreateSeuratObject(counts = counts)

Also, you may be able to reduce RAM consumption by storing the counts as a sparse matrix.
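To illustrate why a sparse matrix helps, here is a minimal sketch on toy data (the matrix dimensions and the Poisson rate are arbitrary assumptions, chosen only to mimic the many zeros typical of scRNA-seq counts). It uses the Matrix package, which ships with R, and compares the memory footprint of the dense and sparse representations.

```r
library(Matrix)

set.seed(1)
# Toy genes x cells count matrix, mostly zeros (hypothetical data, not the GEO file)
dense_counts <- matrix(
  rpois(2000 * 500, lambda = 0.05),
  nrow = 2000, ncol = 500,
  dimnames = list(paste0("gene", 1:2000), paste0("cell", 1:500))
)

# Convert to a compressed sparse matrix; only non-zero entries are stored
sparse_counts <- Matrix(dense_counts, sparse = TRUE)

# Compare the in-memory size of both representations
print(object.size(dense_counts), units = "MB")
print(object.size(sparse_counts), units = "MB")
```

Note that the conversion only saves memory after the data are loaded; it does not by itself speed up reading the file.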

17 months ago
Haci ▴ 730

In order to create a Seurat object, you would need to use one of the functions here that start with Read: https://satijalab.org/seurat/reference/

The function to use will depend on how the data is generated, for example for 10x that would be Read10X().

As for the metadata, you can check AddMetaData()
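As a rough sketch of how AddMetaData() could be used here: it matches metadata rows to cells by row name, so the metadata table needs row names equal to the cell names in the object. The toy data and the column names `cell_id` and `cell_type` below are hypothetical; the real metadata file may name its columns differently.

```r
library(Seurat)
library(Matrix)

set.seed(1)
# Toy count matrix standing in for the real data (hypothetical)
counts <- Matrix(
  matrix(rpois(50 * 20, 1), nrow = 50,
         dimnames = list(paste0("gene", 1:50), paste0("cell", 1:20))),
  sparse = TRUE
)
seuratObj <- CreateSeuratObject(counts = counts)

# Toy metadata, deliberately in a different order than the cells in the object
metadata <- data.frame(
  cell_id   = paste0("cell", 20:1),
  cell_type = rep(c("A", "B"), each = 10)
)

# Use the cell identifiers as row names so rows can be matched to cells
rownames(metadata) <- metadata$cell_id
stopifnot(all(colnames(seuratObj) %in% rownames(metadata)))

# Reorder to the object's cell order and attach the annotation column
seuratObj <- AddMetaData(
  seuratObj,
  metadata = metadata[colnames(seuratObj), "cell_type", drop = FALSE]
)
head(seuratObj[[]])
```

If the cell names in the metadata and in the counts header differ in formatting (prefixes, suffixes), they would need to be harmonized first.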

Read10X does not work because the barcode.txt.gz file is missing.

@Sun, you are right, I missed the part that you only got a count file. In that case, I think you can give fread() from the data.table package a try to read the count data. Then you can convert the result to a sparse matrix before feeding it into CreateSeuratObject(), as biofalconch suggested above.
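A minimal sketch of that workflow follows. It writes a small toy file so the example is self-contained; with the real data you would point fread() at xxx_counts.txt.gz instead. It assumes the file is tab-separated with genes in rows, cells in columns, and gene names in the first column, and that the R.utils package is installed, which fread() needs to read .gz files directly.

```r
library(data.table)  # fread(); reading .gz directly requires R.utils to be installed
library(Matrix)
library(Seurat)

set.seed(1)
# Toy stand-in for xxx_counts.txt.gz (hypothetical layout: gene column + one column per cell)
toy <- data.frame(
  gene = paste0("gene", 1:50),
  matrix(rpois(50 * 20, 1), nrow = 50,
         dimnames = list(NULL, paste0("cell", 1:20)))
)
counts_file <- tempfile(fileext = ".txt.gz")
con <- gzfile(counts_file, "w")
write.table(toy, con, sep = "\t", quote = FALSE, row.names = FALSE)
close(con)

# Read the table; fread is usually much faster than read.table on large files
dt <- fread(counts_file, sep = "\t", header = TRUE)

# The first column holds gene names, the remaining columns hold per-cell counts
genes <- dt[[1]]
mat <- as.matrix(dt[, -1])
rownames(mat) <- genes

# Store as a sparse matrix and free the dense intermediates
counts <- Matrix(mat, sparse = TRUE)
rm(dt, mat)

seuratObj <- CreateSeuratObject(counts = counts)
seuratObj
```

The dense intermediate matrix still has to fit in memory once, so for very large files this may not be enough on its own.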

17 months ago

I'd find some sample 10X output files, and make a dummy barcode file. It doesn't really matter what the barcodes are, they are just names for cells, so it doesn't matter if they aren't the "real" barcode sequences.
