Hi
I am currently working with a tool called LIANA+ (https://liana-py.readthedocs.io/en/latest/notebooks/basic_usage.html), which infers potential cell-cell interactions. The tool takes an AnnData object (adata) as input, but since I did my single-cell analysis in Seurat, I had to convert my processed Seurat object to AnnData in Python. This is my processed Seurat object:
An object of class Seurat
37661 features across 29005 samples within 2 assays
Active assay: SCT (15723 features, 3000 variable features)
 3 layers present: counts, data, scale.data
 1 other assay present: RNA
 2 dimensional reductions calculated: pca, umap
I did the conversion like this, but I am not sure it is correct. In R:
library(Matrix)
# write the expression matrix (gene expression counts)
# I want the normalized counts
counts_matrix <- GetAssayData(SRT, assay='SCT', slot='counts')
writeMM(counts_matrix, file='/lustre1/project/stg_00079/students/soniya/seurat/matrixraw.mtx')
# write dimensional reduction matrix (PCA)
write.csv(SRT@reductions$pca@cell.embeddings,
          file='/lustre1/project/stg_00079/students/soniya/seurat/pca.csv', quote=F, row.names=F)
library(dplyr)
# write gene names
write.table(data.frame('gene'=rownames(counts_matrix)),
            file='/lustre1/project/stg_00079/students/soniya/seurat/gene_names.csv',
            quote=F, row.names=F, col.names=F)
SRT$barcode <- colnames(SRT)
SRT$UMAP_1 <- SRT@reductions$umap@cell.embeddings[,1]
SRT$UMAP_2 <- SRT@reductions$umap@cell.embeddings[,2]
write.csv(SRT@meta.data, file='/lustre1/project/stg_00079/students/soniya/seurat/metadata.csv', quote=F, row.names=F)
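Then in Python:
import anndata
import numpy as np
import pandas as pd
import scanpy as sc
from scipy import io

# read the matrix written by writeMM() in the R code above (matrixraw.mtx)
X = io.mmread("/lustre1/project/stg_00079/students/soniya/seurat/matrixraw.mtx")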
adata = anndata.AnnData(X=X.transpose().tocsr())
metadata = pd.read_csv("/lustre1/project/stg_00079/students/soniya/seurat/metadata.csv")
with open("/lustre1/project/stg_00079/students/soniya/seurat/gene_names.csv", 'r') as f:
    gene_names = f.read().splitlines()
adata.obs = metadata
adata.obs.index = adata.obs['barcode'] #adata.obs_names
adata.var.index = gene_names #adata.var_names
pca = pd.read_csv("/lustre1/project/stg_00079/students/soniya/seurat/pca.csv")
pca.index = adata.obs.index
adata.obsm['X_pca'] = pca.to_numpy()
adata.obsm['X_umap'] = np.vstack((adata.obs['UMAP_1'].to_numpy(), adata.obs['UMAP_2'].to_numpy())).T
sc.pl.umap(adata,color= ['cluster_labels'],frameon = False, save = True)
adata.write("/lustre1/project/stg_00079/students/soniya/seurat/SRT.h5ad")
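One thing the Python part above relies on silently is that the rows of metadata.csv and the lines of gene_names.csv match the matrix written from R in number and order. A minimal sketch of a check that could run before assigning obs and var. The helper name `check_alignment` is hypothetical (it is not part of anndata, scanpy or LIANA+), and the labels below are toy values:

```python
def check_alignment(shape, barcodes, gene_names):
    """Raise ValueError if a (cells, genes) shape disagrees with its labels."""
    n_cells, n_genes = shape
    if len(barcodes) != n_cells:
        raise ValueError(f"{len(barcodes)} barcodes for {n_cells} cells")
    if len(gene_names) != n_genes:
        raise ValueError(f"{len(gene_names)} gene names for {n_genes} genes")
    if len(set(barcodes)) != len(barcodes):
        raise ValueError("duplicate barcodes")
    if len(set(gene_names)) != len(gene_names):
        raise ValueError("duplicate gene names")
    return True


# Toy example: 3 cells x 2 genes with hypothetical labels.
ok = check_alignment((3, 2), ["c1", "c2", "c3"], ["GeneA", "GeneB"])
```

In the script above this would presumably be called as check_alignment(adata.shape, list(metadata['barcode']), gene_names), since adata.shape is (cells, genes) after the transpose.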
So my adata object is saved in this SRT.h5ad file. For the count matrix I used GetAssayData() with assay='SCT' and slot='counts', as seen above.
I took the SCT assay and the counts slot, but I'm not sure that is the right choice. Should I take the counts layer of the RNA assay instead? I can't work it out from the tool's documentation. Also, the documentation says "liana typically works with the log1p-trasformed counts matrix, in this object the normalized counts are stored in raw", so I expected the normalized counts in adata.raw.X, but when I read my SRT.h5ad there is no adata.raw.X.
Does anyone know where I went wrong?
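In case it helps anyone answering: a quick way to tell which kind of values ended up in X is to test whether they are all non-negative integers (count-like) or not (normalized or log-transformed). A minimal sketch on a toy array, assuming numpy is available. The function name `looks_like_counts` is hypothetical, and passing `adata.X.data` for the sparse matrix of the real object is an assumption on my side:

```python
import numpy as np


def looks_like_counts(values):
    """Return True if all values are non-negative whole numbers."""
    values = np.asarray(values, dtype=float)
    return bool(np.all(values >= 0) and np.all(values == np.round(values)))


# Toy cells x genes matrix of counts (made-up numbers).
counts = np.array([[0, 3, 1], [2, 0, 5]])
# The same matrix after library-size normalization and log1p.
logged = np.log1p(counts / counts.sum(axis=1, keepdims=True) * 1e4)

counts_flag = looks_like_counts(counts)
logged_flag = looks_like_counts(logged)
print(counts_flag, logged_flag)
```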
If you are using R, you should check the liana R implementation (https://github.com/saezlab/liana).
Hi, no, I'm using the Python implementation, as linked above.
I saw that, but considering that your analysis is in Seurat and that both liana and LIANA+ are from the same developers, it might be quicker to stick with R.
The documentation says the Python version is a more efficient and faster implementation. I'm just not sure which assay and which layer to use.
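For reference, the sentence quoted from the documentation refers to library-size-normalized, log1p-transformed expression. A minimal numpy sketch of that transformation on a toy cells x genes count matrix. The function name `log1p_normalize` is hypothetical and the scale factor of 10,000 is an assumption (a common default). Whether to start from the RNA or the SCT counts is exactly the open question here, so this does not settle it:

```python
import numpy as np


def log1p_normalize(counts, scale_factor=1e4):
    """Scale each cell (row) to `scale_factor` total counts, then apply log1p."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    return np.log1p(counts / totals * scale_factor)


# Toy matrix: 2 cells x 3 genes (made-up numbers).
toy_counts = np.array([[1, 0, 3], [0, 2, 2]])
normalized = log1p_normalize(toy_counts)
print(normalized.round(2))
```

If I read the LIANA+ documentation correctly, its methods take `use_raw` and `layer` arguments that select where the expression values are read from. Treat that as an assumption and check the API reference before relying on it.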