I'm using scMultiom data with the CCAF tool to predict cell cycle phases. CCAF : https://github.com/plaisier-lab/ccAF CCAF requires a loom file as input. I converted the output h5 file from cellranger-arc and atac_fragment.tsv.gz to a loom file using Seurat's code.
library(Seurat)
library(Signac) library(EnsDb.Hsapiens.v86) library(dplyr) library(ggplot2) library(SeuratDisk)
the 10x hdf5 file contains both data types.
inputdata.10x <- Read10X_h5("D:/Halima's Data/Thesis_2/1_GD428_21136_Hu_REH_Parental/outs/filtered_feature_bc_matrix.h5")
extract RNA and ATAC data
rna_counts <- inputdata.10x$Gene Expression
atac_counts <- inputdata.10x$Peaks
Create Seurat object
pbmc <- CreateSeuratObject(counts = rna_counts) pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
Now add in the ATAC-seq data
we'll only use peaks in standard chromosomes
grange.counts <- StringToGRanges(rownames(atac_counts), sep = c(":", "-")) grange.use <- seqnames(grange.counts) %in% standardChromosomes(grange.counts) atac_counts <- atac_counts[as.vector(grange.use), ] annotations <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v86) seqlevelsStyle(annotations) <- 'UCSC' genome(annotations) <- "hg38"
frag.file <- "D:/Halima's Data/Thesis_2/1_GD428_21136_Hu_REH_Parental/outs/atac_fragments.tsv.gz" chrom_assay <- CreateChromatinAssay( counts = atac_counts, sep = c(":", "-"), genome = 'hg38', fragments = frag.file, min.cells = 10, annotation = annotations ) pbmc[["ATAC"]] <- chrom_assay
pbmc <- subset( x = pbmc, subset = nCount_ATAC < 7e4 & nCount_ATAC > 5e3 & nCount_RNA < 25000 & nCount_RNA > 1000 & percent.mt < 20 )
RNA analysis
DefaultAssay(pbmc) <- "RNA" pbmc <- SCTransform(pbmc, verbose = FALSE) %>% RunPCA() %>% RunUMAP(dims = 1:50, reduction.name = 'umap.rna', reduction.key = 'rnaUMAP_')
ATAC analysis
We exclude the first dimension as this is typically correlated with sequencing depth
DefaultAssay(pbmc) <- "ATAC" pbmc <- RunTFIDF(pbmc) pbmc <- FindTopFeatures(pbmc, min.cutoff = 'q0') pbmc <- RunSVD(pbmc) pbmc <- RunUMAP(pbmc, reduction = 'lsi', dims = 2:50, reduction.name = "umap.atac", reduction.key = "atacUMAP_")
pbmc <- FindMultiModalNeighbors(pbmc, reduction.list = list("pca", "lsi"), dims.list = list(1:50, 2:50)) pbmc <- RunUMAP(pbmc, nn.name = "weighted.nn", reduction.name = "wnn.umap", reduction.key = "wnnUMAP_") pbmc <- FindClusters(pbmc, graph.name = "wsnn", algorithm = 3, verbose = FALSE)
5. Convert the Seurat object to a .loom file
Convert the Seurat object to a .loom file
reh.loom <- as.loom(pbmc, filename = "./output_reh_4.loom")
However, the resulting loom file seems corrupted and won't open in Python, even after multiple attempts. I even consulted ChatGPT, but the issue persists. Any assistance would be appreciated.
import scanpy as sc
3. Load up data
Load up loom file
print('\nLoading REH loom data...') adata3 = sc.read_loom("D:/Halima's Data/Thesis_2/RCode/output_reh_4.loom")
File h5py_objects.pyx:54, in h5py._objects.with_phil.wrapper()
File h5py_objects.pyx:55, in h5py._objects.with_phil.wrapper()
File h5py\h5f.pyx:106, in h5py.h5f.open()
OSError: Unable to open file (file signature not found)
How do I get the loom file so that I can read it on python code?
Might not be that to cause the corruption, but (from docu)
... Always remember to close loom files when done....pbmc.loom$close_all()
.Thanks !!! That worked!