Reading files of different types as H5AD
I want to download scRNA-seq count data from GEO and eventually end up with an H5AD file.
For count data, I usually download three files: barcodes, features, and matrix.
However, some GEO datasets provide different kinds of files, such as TSV or TXT files.
For example, one dataset provides counts along with some information about the cells, while another provides only counts as TXT files for the first 5 samples (which are the scRNA-seq samples I want).
Using scanpy, can I create H5AD files from these?
Thanks!
python
scanpy
anndata
single-cell
written 6 months ago by JACKY • updated 6 months ago by bk11
Here is how to save the count matrix of sample GSM6506112_SP3 from GEO (GSE211956) as an h5ad file.
import pandas as pd
import scanpy as sc
import os
results_file = "GSE211956_RAW/GSM6506112_SP3.h5ad"
os.listdir("GSE211956_RAW/")
['GSM6506112_SP3_features.tsv.gz',
'GSM6506112_SP3_barcodes.tsv.gz',
'GSM6506112_SP3_matrix.mtx.gz']
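# Read the sparse count matrix (10x MTX files are stored as genes x cells)
adata = sc.read_mtx('GSE211956_RAW/GSM6506112_SP3_matrix.mtx.gz')
# Read cell barcodes and gene annotations (neither file has a header row)
adata_bc = pd.read_csv('GSE211956_RAW/GSM6506112_SP3_barcodes.tsv.gz', header=None)
adata_features = pd.read_csv('GSE211956_RAW/GSM6506112_SP3_features.tsv.gz', header=None, sep='\t')
# Transpose to the cells x genes layout that AnnData expects
adata = adata.T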
adata_features
                     0            1                2
0      ENSG00000243485  MIR1302-2HG  Gene Expression
1      ENSG00000237613      FAM138A  Gene Expression
2      ENSG00000186092        OR4F5  Gene Expression
3      ENSG00000238009   AL627309.1  Gene Expression
4      ENSG00000239945   AL627309.3  Gene Expression
...                ...          ...              ...
33533  ENSG00000277856   AC233755.2  Gene Expression
33534  ENSG00000275063   AC233755.1  Gene Expression
33535  ENSG00000271254   AC240274.1  Gene Expression
33536  ENSG00000277475   AC213203.1  Gene Expression
33537  ENSG00000268674      FAM231C  Gene Expression
33538 rows × 3 columns
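# Label cells with their barcodes and genes with their symbols (column 1)
adata.obs_names = adata_bc[0].tolist()
adata.var_names = adata_features[1].tolist()
adata.var_names_make_unique()  # gene symbols are not guaranteed to be unique
# Keep the Ensembl IDs (column 0) as a gene annotation
adata.var['gene_id'] = adata_features[0].tolist()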
adata
AnnData object with n_obs × n_vars = 1504 × 33538
var: 'gene_id'
adata.write_h5ad(results_file)
os.listdir("GSE211956_RAW/")
['GSM6506112_SP3_features.tsv.gz',
'GSM6506112_SP3.h5ad',
'GSM6506112_SP3_barcodes.tsv.gz',
'GSM6506112_SP3_matrix.mtx.gz']
6 months ago by bk11