How to do DE between cells expressing a specific gene and cells not expressing it in scanpy?
9 months ago
bioinfo ▴ 150

Hello,

I would like to do differential expression between cells expressing a specific gene and cells not expressing it in scanpy, but I cannot figure out how to do it. I know I can filter for the gene like this, but that just removes all the cells not expressing it.

expressing_cells = adata[adata[: , 'gene'].X > 0, :] 

Does anyone know how to do this?

Thank you

scRNA-seq single-cell scanpy • 787 views
9 months ago
Pratik ★ 1.1k

My Python and scanpy terminology may be a little off here, but you could try adding an observation annotation (a column in adata.obs) that records whether each cell expresses the gene, and then run differential gene expression with groupby on that column. This is with a huge assist from ChatGPT 3.5, but check it out:

This link served as the inspiration for the question I asked ChatGPT:

https://nbisweden.github.io/workshop-scRNAseq/labs/scanpy/scanpy_05_dge.html#meta-dge_cond

From ChatGPT 3.5:

Certainly! Here's the whole script incorporating the steps to add the custom observation and perform differential gene expression analysis:

import scanpy as sc

# Assuming you have already loaded your AnnData object, which contains gene expression data
adata = ...  # Load or create your AnnData object

# 1. Define the gene of interest
gene_of_interest = 'your_gene'  # Specify the gene of interest

# 2. Determine expression status of the gene for each cell
# obs_vector returns a 1-D array of per-cell values taken from .X,
# so the comparison gives one boolean per cell whether .X is dense or sparse
is_expressed = adata.obs_vector(gene_of_interest) > 0

# 3. Convert the expression status to custom strings
expression_status = ['cells_where_gene_is_expressed' if expr else 'cells_where_gene_is_NOT_expressed' for expr in is_expressed]

# 4. Add the custom observation to the AnnData object
adata.obs['gene_expression_status'] = expression_status

# 5. Perform differential gene expression analysis based on the custom observation
sc.tl.rank_genes_groups(adata, groupby='gene_expression_status', method='t-test')

# Access the results of differential gene expression analysis
results = adata.uns['rank_genes_groups']

# Now you can access the differentially expressed genes for each group
# results['names'] contains the names of differentially expressed genes
# results['logfoldchanges'] contains the log fold changes
# results['pvals'] contains the p-values, etc.

Replace 'your_gene' with the gene you are interested in. Adjust the observation name 'gene_expression_status' as needed. After running this script, you'll have performed differential gene expression analysis based on the custom observation 'gene_expression_status', comparing the gene expression profiles between cells where the gene is expressed and cells where it is not expressed.

This is me now: you can also change method='t-test' to one of the other tests scanpy supports, such as 'wilcoxon', 't-test_overestim_var', or 'logreg'.
