Adata.raw.X in LIANA, something wrong with conversion from Seurat to adata in python.
6 months ago

Hi

I am currently working with a tool called LIANA+ (https://liana-py.readthedocs.io/en/latest/notebooks/basic_usage.html), which infers potential cell-cell interactions. Its input is an AnnData object (adata), but since I did my single-cell analysis in Seurat, I had to convert my processed Seurat object to AnnData in Python. This was my processed Seurat object:

An object of class Seurat 
37661 features across 29005 samples within 2 assays 
Active assay: SCT (15723 features, 3000 variable features)
 3 layers present: counts, data, scale.data
 1 other assay present: RNA
 2 dimensional reductions calculated: pca, umap

I did the conversion as follows, but I am not sure it is correct. In R:

library(Seurat)
library(Matrix)
library(dplyr)
# write the expression matrix (genes x cells)
# I want the normalized counts; note that slot='counts' of the SCT assay holds the SCT-corrected counts, not log-transformed values
# NOTE: `data` is assumed to be the same Seurat object as `SRT` used below
counts_matrix <- GetAssayData(data, assay='SCT', slot='counts')
writeMM(counts_matrix, file='/lustre1/project/stg_00079/students/soniya/seurat/matrixraw.mtx')
# write dimensional reduction matrix (PCA)
write.csv(SRT@reductions$pca@cell.embeddings,
          file='/lustre1/project/stg_00079/students/soniya/seurat/pca.csv', quote=F, row.names=F)
# write gene names, one per line
write.table(data.frame('gene'=rownames(counts_matrix)),
            file='/lustre1/project/stg_00079/students/soniya/seurat/gene_names.csv',
            quote=F, row.names=F, col.names=F)
# add barcodes and UMAP coordinates to the metadata, then write it out
SRT$barcode <- colnames(SRT)
SRT$UMAP_1 <- SRT@reductions$umap@cell.embeddings[,1]
SRT$UMAP_2 <- SRT@reductions$umap@cell.embeddings[,2]
write.csv(SRT@meta.data, file='/lustre1/project/stg_00079/students/soniya/seurat/metadata.csv', quote=F, row.names=F)

Then, in Python:

import anndata
import numpy as np
import pandas as pd
import scanpy as sc
from scipy import io

# read the expression matrix (genes x cells) and transpose it to cells x genes
# NOTE: the R code above writes matrixraw.mtx, so this path has to point at the same file
X = io.mmread("/lustre1/project/stg_00079/students/soniya/seurat/matrix.mtx")
adata = anndata.AnnData(X=X.transpose().tocsr())
# cell metadata, indexed by barcode
metadata = pd.read_csv("/lustre1/project/stg_00079/students/soniya/seurat/metadata.csv")
with open("/lustre1/project/stg_00079/students/soniya/seurat/gene_names.csv", 'r') as f:
    gene_names = f.read().splitlines()
adata.obs = metadata
adata.obs.index = adata.obs['barcode']  # adata.obs_names
adata.var.index = gene_names  # adata.var_names
# PCA and UMAP embeddings
pca = pd.read_csv("/lustre1/project/stg_00079/students/soniya/seurat/pca.csv")
pca.index = adata.obs.index
adata.obsm['X_pca'] = pca.to_numpy()
adata.obsm['X_umap'] = np.vstack((adata.obs['UMAP_1'].to_numpy(), adata.obs['UMAP_2'].to_numpy())).T
sc.pl.umap(adata, color=['cluster_labels'], frameon=False, save=True)
adata.write("/lustre1/project/stg_00079/students/soniya/seurat/SRT.h5ad")
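
One way to sanity-check this kind of conversion is to confirm that the transposed matrix, the barcodes, and the gene names all have matching lengths before assigning them. The following is a minimal sketch on toy data, not my real files: the in-memory buffer stands in for the .mtx file, and `check_alignment`, the barcodes, and the gene names are hypothetical names made up for illustration.

```python
import io

import numpy as np
from scipy import sparse
from scipy.io import mmread, mmwrite


def check_alignment(X, barcodes, gene_names):
    """Hypothetical helper: raise if a cells x genes matrix does not match its labels."""
    n_cells, n_genes = X.shape
    if n_cells != len(barcodes):
        raise ValueError(f"{n_cells} rows in X but {len(barcodes)} barcodes")
    if n_genes != len(gene_names):
        raise ValueError(f"{n_genes} columns in X but {len(gene_names)} gene names")


# Toy genes x cells matrix (4 genes, 3 cells), the orientation Seurat uses.
genes_by_cells = sparse.csr_matrix(np.arange(12, dtype=np.int64).reshape(4, 3))

# Round-trip through Matrix Market format in memory instead of a file on disk.
buf = io.BytesIO()
mmwrite(buf, genes_by_cells)
buf.seek(0)

# Transpose to cells x genes, the orientation AnnData expects.
X = mmread(buf).transpose().tocsr()

barcodes = ["cell_a", "cell_b", "cell_c"]
gene_names = ["g1", "g2", "g3", "g4"]
check_alignment(X, barcodes, gene_names)  # passes silently when everything lines up
```

If the transpose were forgotten, the same check would raise a ValueError, because the matrix would have 4 rows for 3 barcodes.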

My AnnData object is now saved in the SRT.h5ad file. For the count matrix I used this line of code, as seen above: counts_matrix <- GetAssayData(data, assay='SCT', slot='counts')

I took the SCT assay and its counts layer, but I'm not sure that is the right choice. Should I take the counts layer of the RNA assay instead? I can't work this out from the tool's documentation. Also, regarding adata.raw.X, the documentation says "liana typically works with the log1p-trasformed counts matrix, in this object the normalized counts are stored in raw", but when I read my SRT.h5ad there is no adata.raw.X.

Does anyone know where I went wrong?

If you are using R, you should check the liana R implementation (https://github.com/saezlab/liana).

Hi, no, I'm using the Python implementation linked above.

I saw that, but considering that your analysis is in Seurat and that both liana and liana+ are from the same developers, it might be quicker to stick with R.

The documentation says the Python version is a more efficient and faster implementation. I'm just not sure which assay and which layer to use.
