Question

Need HVGs from concatenated spatial seq data to integrate CODEX with scRNAseq

2.3 years ago

Arta • 0

Hi, all.

I need to integrate CODEX data with 5' and 3' sequenced scRNA-seq data. The scRNA-seq datasets have already been integrated with each other.

I want to use a tool named GLUER. However, this tool requires a key called vst_variance_standardized:

    common_feature = np.intersect1d(ref_obj.var.sort_values(by=['vst_variance_standardized'],
                                                            ascending=False).index.values[:n_features],
                                    query_obj.var.sort_values(by=['vst_variance_standardized'],
                                                              ascending=False).index.values[:n_features])

    common_feature_selected = np.intersect1d(ref_obj.var.sort_values(by=['vst_variance_standardized'],
                                                                     ascending=False).index.values[:filter_n_features[0]],
                                             query_obj.var.sort_values(by=['vst_variance_standardized'],
                                                                       ascending=False).index.values[:filter_n_features[1]])
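
To make the requirement concrete, here is a minimal sketch of what the two statements above compute: each table is ranked by `vst_variance_standardized`, the top entries are kept, and the two sets are intersected. The helper name `top_common_features`, the marker names, and the scores are made up for illustration; the plain DataFrames stand in for `ref_obj.var` and `query_obj.var`.

```python
import numpy as np
import pandas as pd

def top_common_features(ref_var, query_var, n_ref, n_query,
                        key="vst_variance_standardized"):
    """Intersect the top-ranked features of two `.var`-like tables."""
    top_ref = ref_var.sort_values(by=[key], ascending=False).index.values[:n_ref]
    top_query = query_var.sort_values(by=[key], ascending=False).index.values[:n_query]
    return np.intersect1d(top_ref, top_query)

# Toy stand-ins for ref_obj.var and query_obj.var (hypothetical values).
markers = ["CD3", "CD4", "CD8", "CD19"]
ref_var = pd.DataFrame({"vst_variance_standardized": [3.0, 0.5, 2.0, 1.0]}, index=markers)
query_var = pd.DataFrame({"vst_variance_standardized": [0.2, 2.5, 1.5, 3.5]}, index=markers)

common = top_common_features(ref_var, query_var, n_ref=3, n_query=3)
print(list(common))
```

So whatever produces the variable features only has to leave a numeric column with that name in `.var` of both objects.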

I can generate these variable genes using scanpy's sc.pp.highly_variable_genes(), but when I try to do this with the concatenated CODEX data, I get the following error:

    >>> sc.pp.highly_variable_genes(adata=adata_concat, n_top_genes=5, flavor="seurat_v3", inplace=True, batch_key='csv_sample')
    /home/seyediana/miniconda3/envs/GLUER/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py:62: UserWarning: `flavor='seurat_v3'` expects raw count data, but non-integers were found.
      warnings.warn(
    /home/seyediana/miniconda3/envs/GLUER/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py:83: RuntimeWarning: invalid value encountered in log10
      x = np.log10(mean[not_const])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/seyediana/miniconda3/envs/GLUER/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py", line 422, in highly_variable_genes
        return _highly_variable_genes_seurat_v3(
      File "/home/seyediana/miniconda3/envs/GLUER/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py", line 85, in _highly_variable_genes_seurat_v3
        model.fit()
      File "_loess.pyx", line 899, in _loess.loess.fit
    ValueError: b'Extrapolation not allowed with blending'

I get an identical error with the scRNA-seq data.

One line that stands out to me is this warning: .../_highly_variable_genes.py:62: UserWarning: `flavor='seurat_v3'` expects raw count data, but non-integers were found.

This is because, for both datasets, adata.X contains normalized counts. Both objects have an adata.raw, but I have no idea how to use it; all of my attempts to use it have failed.
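
A quick way to tell which matrix actually holds counts is to test for non-negative whole numbers. The sketch below is only an illustration of that kind of check, not scanpy's own implementation; the helper name `looks_like_raw_counts` and the toy arrays are hypothetical. It expects a dense array, so for a sparse matrix one could pass its stored values (for example `matrix.data`) instead.

```python
import numpy as np

def looks_like_raw_counts(values):
    """Return True if every entry is a finite, non-negative whole number."""
    values = np.asarray(values, dtype=float)
    return bool(
        np.all(np.isfinite(values))
        and np.all(values >= 0)
        and np.all(values == np.round(values))
    )

# Toy matrices (hypothetical): one normalized, one with integer counts.
normalized = np.array([[0.0, 1.7], [2.3, 0.4]])
counts = np.array([[0, 3], [12, 1]])

print(looks_like_raw_counts(normalized))  # False
print(looks_like_raw_counts(counts))      # True
```

Running the same check on adata.X and on adata.raw.X would show which of the two is suitable input for flavor="seurat_v3".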

Do you have any suggestions?

scRNA cell spatial CODEX single seurat • 1.6k views

score 1 · Accepted Answer · 2022-08-24

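Figured it out: I have to extract the sparse matrix from adata.raw and store it as a layer, like so:

    import pandas as pd
    import scanpy as sc

    # Copy the raw counts out of adata.raw into a layer, labelled by gene and cell
    adata_concat.layers['raw'] = pd.DataFrame.sparse.from_spmatrix(adata_concat.raw.X)
    adata_concat.layers['raw'].columns = adata_concat.var.index
    adata_concat.layers['raw'].index = adata_concat.obs.index

    # Point the HVG call at that layer instead of the normalized adata.X
    sc.pp.highly_variable_genes(adata=adata_concat, n_top_genes=5, layer='raw', flavor="seurat_v3", inplace=True, batch_key='csv_sample')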
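
One remaining gap: with flavor="seurat_v3", scanpy writes the standardized variance to a `.var` column named `variances_norm`, while the GLUER snippet in the question sorts by `vst_variance_standardized`. Assuming the two are meant to hold the same quantity (an assumption, not something confirmed in this thread), the column can simply be copied under the expected name. The helper name `add_gluer_key` and the toy values below are hypothetical; the DataFrame stands in for `adata_concat.var` after the HVG call.

```python
import pandas as pd

def add_gluer_key(var, source="variances_norm", target="vst_variance_standardized"):
    """Return a copy of a `.var`-like table with `source` duplicated under `target`."""
    if source not in var.columns:
        raise KeyError(f"{source!r} not found; run the HVG step first")
    out = var.copy()
    out[target] = out[source]
    return out

# Toy stand-in for adata_concat.var after the HVG call (hypothetical values).
var = pd.DataFrame({"variances_norm": [1.8, 0.6, 1.1]}, index=["CD3", "CD4", "CD8"])
var = add_gluer_key(var)

ranked = var.sort_values(by=["vst_variance_standardized"], ascending=False).index.tolist()
print(ranked)
```

On a real object the same effect can be had in one line, for example `adata_concat.var["vst_variance_standardized"] = adata_concat.var["variances_norm"]`, applied to both the reference and the query object.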