Question

How to choose cells that co-express several genes with scanpy?

5 months ago

bioinfo ▴ 150

Hello,

I have an AnnData object in scanpy, and I want to keep only the cells that co-express two genes. The following works when specifying one gene:

Adata_sub = ad[(ad[:, 'GENE1'].X > 0), :]

However, when I try to specify two genes, it fails with the following error:

Adata_sub = ad[(ad[:, 'GENE1'].X > 0) & (ad[:, 'GENE2'].X > 0), :]
    TypeError: unsupported operand type(s) for &: 'SparseCSRView' and 'SparseCSRView'

I am working with the normalized/scaled data, and I think the issue is that after running those steps ad.X is a sparse matrix with elements of type '<class 'numpy.float32'>'. If I also run the regress and scale steps after normalization, ad.X is no longer a sparse matrix, so the command works without the error. However, I want to subset based on the normalized data. How can I do that?

I found the following issue, but it doesn't seem to contain a solution: https://github.com/scverse/scanpy/issues/1870

Thank you

scanpy scRNA-seq single-cell • 332 views

score 0 · Answer 1 · 2024-07-23

5 months ago

LChart 4.7k

Just hack it: ((ad[: , 'GENE1'].X > 0) + (ad[: , 'GENE2'].X > 0)) == 2
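
An alternative that avoids sparse boolean arithmetic altogether is to turn each single-gene column into a plain 1-D NumPy boolean array and combine those with `&`. This may also be more predictable than the sum trick, since adding two boolean sparse matrices in SciPy can behave like a logical OR rather than a count. Below is a minimal sketch, not a definitive implementation: `expressed` and `coexpression_mask` are hypothetical helper names, the small matrix is made-up data standing in for a sparse `ad.X`, and the threshold of 0 is taken from the question.

```python
import numpy as np
from scipy import sparse


def expressed(column, threshold=0):
    """Return a 1-D boolean array, True where the value exceeds `threshold`.

    `column` may be a dense array or a scipy sparse matrix of shape (n_cells, 1).
    """
    if sparse.issparse(column):
        column = column.toarray()  # a single column, so densifying is cheap
    return np.asarray(column).ravel() > threshold


def coexpression_mask(adata, genes, threshold=0):
    """Hypothetical helper: mask of cells expressing every gene in `genes`."""
    mask = np.ones(adata.n_obs, dtype=bool)
    for gene in genes:
        mask &= expressed(adata[:, gene].X, threshold)
    return mask


# Made-up 4 cells x 2 genes matrix standing in for a sparse ad.X
X = sparse.csr_matrix(
    np.array([[1.0, 2.0], [0.0, 3.0], [4.0, 0.0], [5.0, 6.0]], dtype=np.float32)
)

# 1-D masks combine with & without any sparse-type error
mask = expressed(X[:, 0]) & expressed(X[:, 1])
print(mask)  # [ True False False  True]
```

With the object from the question, this would be used as Adata_sub = ad[coexpression_mask(ad, ['GENE1', 'GENE2']), :].copy(). Note that on scaled data values can be negative, so a threshold of 0 only means "expressed" on the normalized (unscaled) matrix.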