2.3 years ago
Emily
Most of the scRNA-seq tutorials I've seen start from read count data (matrix, features, barcodes, etc.) packaged as tar.gz, but the count matrix I have is a CSV that is already gene (row) by cell (column). Do I need to convert it to AnnData to run it through Scanpy?
You can load a text matrix directly using scanpy.read_text. You should check the documentation, as scanpy, like Seurat, supports multiple input types.
I checked the documentation, and scanpy can also read CSV using scanpy.read_csv (or sc.read_csv), but when I run the code it doesn't show any var names, only the AnnData object dimensions. I expected the output to look like: AnnData object with n_obs x n_vars ... var: 'gene_ids'
Code I ran:
import scanpy as sc
adata = sc.read_csv('~/filename.csv')
adata.var_names_make_unique()
adata
What does your CSV look like? Are the genes stored in the first column of the file? Scanpy assumes that the gene names are there:
first_column_names : Optional[bool] (default: None) Assume the first column stores row names.
The first column is the cell index (literally named 'Cell_Index'), and the genes start in the second column and beyond in the original file, so I transposed it.
My code:
import pandas as pd
mat = pd.read_csv('~/filename.csv', skiprows=7, index_col=False)  # the first 7 rows are receipts, so I skipped them
df = mat.transpose()
df.columns = df.iloc[0]  # use the first row (the cell index values) as column names
df = df[1:]
df.head()
df.to_csv(...)
df_1 = pd.read_csv('~/filename.csv')
df_1.rename(columns={df_1.columns[0]: "gene_ids"}, inplace=True)
df_1.to_csv(...)
After running that, my first column is 'gene_ids' with the genes under it, and the second column is the ID from the cell index. Result output: https://imgur.com/a/s3iOgtK
By the way, the original data didn't come with barcodes.tsv, genes.tsv, or matrix.mtx, just that one CSV file.