I am trying to bring this small dataset (1496 cells, 43 MB) from cellxgene into R.
1 R SeuratDisk
h5ad to h5seurat and then to seurat.
library(SeuratDisk)
SeuratDisk::Convert("local.h5ad", dest = "local.h5seurat", overwrite=TRUE)
g <- SeuratDisk::LoadH5Seurat("local.h5seurat", meta.data=FALSE, misc=FALSE)
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding feature-level metadata for RNA
Error: Missing required datasets 'levels' and 'values'
Or, at other times, a different error:
Validating h5Seurat file
Initializing RNA with data
Error in if ((lp <- length(p)) < 1 || p[1] != 0 || any((dp <- p[-1] - :
missing value where TRUE/FALSE needed
In addition: Warning message:
In sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], :
NAs introduced by coercion to integer range
The conversion to h5seurat succeeds, but reading the h5seurat file back in fails.
2 R anndata
library(anndata)
g <- read_h5ad("local.h5ad") # crashes rstudio locally
Error in py_call_impl(callable, dots$args, dots$keywords) :
anndata._io.utils.AnnDataReadError: Above error raised while reading key '/layers' of type <class 'h5py._hl.group.Group'> #from /.
Fails while reading the '/layers' group of the file; I don't know why.
3 R SCP
h5ad to a Seurat object, via scanpy and reticulate.
# remotes::install_github("zhanghao-njmu/SCP",upgrade="never")
library(SCP)
library(reticulate)
sc <- import("scanpy") # crashes rstudio locally
adata <- sc$read_h5ad("local.h5ad")
srt <- adata_to_srt(adata)
R crashes at the import("scanpy") line. Something to do with reticulate, I assume.
4 R Zellkonverter
# remotes::install_github("theislab/zellkonverter",upgrade="never")
library(zellkonverter)
g <- readH5AD("local.h5ad", verbose = TRUE, reader = "python")
ℹ Using the Python reader
ℹ Using anndata version 0.8.0
sh: 5: /home/roy/miniconda3/envs/r-4.1/etc/conda/deactivate.d/udunits2-deactivate.sh: [[: not found
sh: 5: /home/roy/miniconda3/envs/r-4.1/etc/conda/deactivate.d/geotiff-deactivate.sh: [[: not found
sh: 5: /home/roy/miniconda3/envs/r-4.1/etc/conda/deactivate.d/gdal-deactivate.sh: [[: not found
sh: 11: /home/roy/miniconda3/envs/r-4.1/etc/conda/deactivate.d/gdal-deactivate.sh: [[: not found
sh: 4: /home/roy/miniconda3/envs/r-4.1/etc/conda/deactivate.d/deactivate-r-base.sh: [[: not found
sh: 5: /home/roy/miniconda3/envs/r-4.1/etc/conda/deactivate.d/deactivate-gxx_linux-64.sh: Syntax error: "(" unexpected
Warning message:
In system(paste(act.cmd, collapse = " "), intern = TRUE) :
running command '. '/home/roy/.cache/R/basilisk/1.4.0/0/etc/profile.d/conda.sh' && conda activate && /home/roy/miniconda3/envs/r-4.1/lib/R/bin/Rscript --no-save --no-restore --no-site-file --no-init-file --default-packages=NULL -e "con <- socketConnection(port=11303, open='wb', blocking=TRUE);serialize(Sys.getenv(), con);close(con)"' had status 2
Fails with reader = "python". Possibly a conda issue?
library(zellkonverter)
g <- readH5AD("local.h5ad", verbose = TRUE, reader = "R")
ℹ Using the R reader
✔ Reading local.h5ad [3.9s]
Warning message:
In value[[3L]](cond) :
setting 'colData' failed for 'local.h5ad': cannot coerce class
"list" to a DataFrame
Also fails with reader = "R".
5 Python scanpy
I also tried scanpy in Python. I don't know any Python, so I just tried two lines of basic code.
import scanpy as sc
g = sc.read_h5ad("local.h5ad")
Traceback (most recent call last):
File "/home/roy/miniconda3/lib/python3.8/site-packages/anndata/_io/utils.py", line 156, in func_wrapper
return func(elem, *args, **kwargs)
File "/home/roy/miniconda3/lib/python3.8/site-packages/anndata/_io/h5ad.py", line 510, in read_group
EncodingVersions[encoding_type].check(
File "/home/roy/miniconda3/lib/python3.8/enum.py", line 387, in __getitem__
return cls._member_map_[name]
KeyError: 'dict'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/roy/miniconda3/lib/python3.8/site-packages/anndata/_io/h5ad.py", line 413, in read_h5ad
d[k] = read_attribute(f[k])
File "/home/roy/miniconda3/lib/python3.8/functools.py", line 875, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
File "/home/roy/miniconda3/lib/python3.8/site-packages/anndata/_io/utils.py", line 162, in func_wrapper
raise AnnDataReadError(
anndata._io.utils.AnnDataReadError: Above error raised while reading key '/layers' of type <class 'h5py._hl.group.Group'> from /.
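One possible reading of the traceback above: the KeyError: 'dict' comes from the installed anndata not recognising an encoding type stored in the file, which would be consistent with the file having been written by a newer anndata than the one under /home/roy/miniconda3. The sketch below only reports installed package versions and compares them against a minimum. The 0.8.0 threshold is an assumption based on the version zellkonverter reported, and the helper names (parse_version, is_older_than, installed_version) are made up for this example.

```python
from importlib import metadata  # standard library in Python 3.8+


def parse_version(text):
    """Return the leading numeric parts of a version string as a tuple of ints."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component such as 'dev1'
        parts.append(int(digits))
    return tuple(parts)


def is_older_than(installed, minimum):
    """True if `installed` sorts before `minimum` after zero-padding both."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a < b


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: 0.8.0 is the first anndata release that understands the file.
    minimum = {"anndata": "0.8.0"}
    for name in ("anndata", "scanpy", "h5py"):
        version = installed_version(name)
        note = ""
        if version and name in minimum and is_older_than(version, minimum[name]):
            note = " (older than " + minimum[name] + ", consider upgrading)"
        print(name, version, note)
```

If anndata turns out to be older than the threshold, upgrading it in the same environment that R and reticulate use would be the first thing to try before blaming the file.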
I am not sure whether this is related to the input file or is just a series of unrelated issues. How does one import a cellxgene dataset into R? Are there any other tools or solutions, perhaps a web app or a Docker container? All I want is to get the raw counts and metadata out.
Using R 4.0.1 and/or R 4.1.1. Python 3.8.13.
If there is no specific reason for trying to download the h5ad file, downloading the rds file is rather straightforward and should help you to access raw counts and metadata.
Large datasets do not have the Rds download as an option.
I am trying to use the same data and the same method, but when I read the CSV expression matrix into R, it freezes. Could you share your final seurat.Rds file? That would help many people. Thanks.
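For datasets without an Rds download, or where a dense CSV is too large for R to read comfortably, one option is to write the counts in sparse Matrix Market format from Python and load them in R with Matrix::readMM, reading the metadata with read.csv. This is a minimal sketch, not a tested recipe for this particular file. It assumes an anndata version recent enough to open the file, and it assumes raw counts live in adata.raw.X when .raw is present and in adata.X otherwise. The function names and the output file naming are invented for this example.

```python
from pathlib import Path


def output_paths(prefix):
    """Build the three output file paths from a prefix such as 'out/local'."""
    prefix = Path(prefix)
    return {
        "counts": prefix.with_name(prefix.name + "_counts.mtx"),
        "obs": prefix.with_name(prefix.name + "_obs.csv"),
        "var": prefix.with_name(prefix.name + "_var.csv"),
    }


def export_for_r(h5ad_path, prefix):
    """Write counts (genes x cells, sparse) plus cell and gene metadata."""
    # Third-party imports stay inside the function so that output_paths()
    # can be used without anndata or scipy installed.
    import anndata as ad
    from scipy import io, sparse

    adata = ad.read_h5ad(h5ad_path)

    # Assumption: raw counts are under .raw when it exists, else in .X.
    source = adata.raw if adata.raw is not None else adata
    counts = sparse.csr_matrix(source.X)

    paths = output_paths(prefix)
    # Transpose to genes x cells, the orientation Seurat expects.
    io.mmwrite(str(paths["counts"]), counts.T)
    adata.obs.to_csv(paths["obs"])   # per-cell metadata
    source.var.to_csv(paths["var"])  # per-gene metadata
    return paths


# Example call (not run here):
# export_for_r("local.h5ad", "local")
```

Keeping the matrix sparse end to end avoids materialising every zero, which is the likely reason a dense CSV of the expression matrix stalls when read into R.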