How to choose cells that co-express several genes with scanpy?
4 months ago
bioinfo ▴ 150

Hello,

I have an AnnData object in scanpy, but I want to keep only the cells that co-express two genes. The command below works when specifying one gene:

Adata_sub = ad[(ad[:, 'GENE1'].X > 0), :]

However, when I try to specify two genes, it fails with the following error:

Adata_sub = ad[(ad[:, 'GENE1'].X > 0) & (ad[:, 'GENE2'].X > 0), :]
    TypeError: unsupported operand type(s) for &: 'SparseCSRView' and 'SparseCSRView'

I am working with the normalized/scaled data, and I think the issue is that after running those commands I end up with a sparse matrix of type '<class 'numpy.float32'>'. If I run the regress and scale steps after normalization, ad.X is no longer a sparse matrix, so the command works without the error. However, I want to subset based on the normalized data. How can I do that?

I found the issue linked below, but it does not seem to contain a solution: https://github.com/scverse/scanpy/issues/1870

Thank you

scanpy scRNA-seq single-cell • 296 views
4 months ago
LChart 4.5k

Just hack it: ((ad[: , 'GENE1'].X > 0) + (ad[: , 'GENE2'].X > 0)) == 2
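If summing the two comparisons does not behave as hoped (boolean sparse matrices may stay boolean when added, in which case nothing would ever equal 2), another option is to turn each gene's column into a plain 1-D NumPy boolean array and combine those with &. Below is a minimal sketch of that idea on a toy SciPy CSR matrix standing in for ad.X. The helper name coexpression_mask and the toy values are made up for illustration and are not part of scanpy or anndata.

```python
import numpy as np
from scipy import sparse


def coexpression_mask(X, gene_idx):
    """Return a 1-D boolean mask of rows (cells) with a value > 0
    in every column (gene) listed in gene_idx.

    Hypothetical helper for illustration; X may be sparse or dense.
    """
    mask = np.ones(X.shape[0], dtype=bool)
    for j in gene_idx:
        col = X[:, j]
        # Densify one column at a time so the full matrix stays sparse
        col = col.toarray() if sparse.issparse(col) else np.asarray(col)
        mask &= col.ravel() > 0
    return mask


# Toy stand-in for ad.X: 4 cells x 3 genes, float32, CSR
X = sparse.csr_matrix(
    np.array(
        [
            [1.0, 0.0, 2.0],
            [0.5, 1.5, 0.0],
            [0.0, 3.0, 0.0],
            [2.0, 0.1, 0.0],
        ],
        dtype=np.float32,
    )
)

# Cells expressing both gene 0 and gene 1
mask = coexpression_mask(X, [0, 1])
X_sub = X[mask]
```

With a real AnnData object, the column indices could presumably be looked up with ad.var_names.get_loc('GENE1') and the mask applied as ad[mask, :], but I have not tested that against the object in the question.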
