Question

Can I enrich a gene expression signature derived from single cell RNA-Seq in a bulk RNA-Seq data set?

2

4.9 years ago

miherbert ▴ 20

Dear all,

my supervisor recently asked me to test whether a gene expression signature from a published scRNA-Seq data set is enriched in my bulk RNA-Seq data set. I have the list of DE genes, with logFC and p-value, from that single-cell paper in an Excel table. Is there anything I need to consider when comparing single-cell RNA-Seq with bulk RNA-Seq? Which package/function would you use for this task (I work in R)? I have been looking at fgsea and clusterProfiler, but cannot decide which one would be more advantageous. Any other suggestions?

Thank you very much for your help!

P.S. My supervisor is not a bioinformatician and cannot help me with this one.

scRNA-Seq RNASeq GSEA gene expression signature • 2.6k views

Updated 4.9 years ago by GouthamAtla 12k • written 4.9 years ago by miherbert ▴ 20

score 4 · Answer 1 · 2020-07-23

4

4.9 years ago

GouthamAtla 12k

There are several methods that deconvolute bulk RNA-Seq using scRNA-Seq data to infer the proportions of the underlying cell types in the bulk data. A few of them are listed below.

MuSiC

bisque

CIBERSORTx

You can provide either marker genes or the raw scRNA-Seq expression matrix to estimate the underlying cell-type proportions.

0

Thank you for your comment. Unfortunately, the purpose is not to find the underlying cell-type proportions, but rather to see whether the disease state in my bulk data resembles the one described by the single-cell signature.

The resources you listed seem very helpful for the purpose you described, but not for my problem.

4.9 years ago by miherbert ▴ 20

1

It's not clear from your question that you are talking about disease signatures. In that case, unless you have some "control" data set to compare against, it's hard to show enrichment of the disease signature genes.

Think of your disease signature genes as a "gene set" like any other (for example, the genes of a KEGG pathway). To show enrichment in your bulk samples, you need a "control" data set for those samples, so that you can show that the genes in your gene set have higher expression, or score higher on some similar metric, than they do in the controls.
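To make the comparison against control samples concrete, here is a minimal sketch of a per-sample signature score. It is written in Python only for illustration (the same idea carries over directly to R). The function name `signature_scores`, the gene names, the sample names, and all expression values are made up for this example. It z-scores each gene across samples, averages the z-scores of the signature genes per sample, and then compares the group means. It assumes all signature genes are up-regulated; down-regulated genes would need their sign flipped according to the logFC.

```python
from statistics import mean, pstdev

def signature_scores(expr, signature):
    """Mean z-score of the signature genes for each sample.

    expr: dict gene -> dict sample -> normalized expression (e.g. log2 CPM).
    signature: iterable of gene names; genes absent from expr are skipped.
    """
    genes = [g for g in signature if g in expr]
    if not genes:
        raise ValueError("no signature gene found in the expression table")
    samples = list(next(iter(expr.values())))
    totals = {s: 0.0 for s in samples}
    used = 0
    for g in genes:
        values = [expr[g][s] for s in samples]
        mu, sd = mean(values), pstdev(values)
        if sd == 0:  # a constant gene carries no information
            continue
        used += 1
        for s in samples:
            totals[s] += (expr[g][s] - mu) / sd
    if used == 0:
        raise ValueError("all signature genes are constant across samples")
    return {s: totals[s] / used for s in samples}

# Hypothetical toy data: two control and two disease bulk samples.
expr = {
    "GENE_A": {"ctrl1": 2.0, "ctrl2": 2.2, "dis1": 5.1, "dis2": 4.8},
    "GENE_B": {"ctrl1": 1.0, "ctrl2": 1.3, "dis1": 3.9, "dis2": 4.2},
    "GENE_C": {"ctrl1": 6.0, "ctrl2": 6.1, "dis1": 6.0, "dis2": 5.9},
}
signature = ["GENE_A", "GENE_B"]  # stand-in for the scRNA-Seq DE genes
groups = {"ctrl1": "control", "ctrl2": "control",
          "dis1": "disease", "dis2": "disease"}

scores = signature_scores(expr, signature)
disease_mean = mean(scores[s] for s in scores if groups[s] == "disease")
control_mean = mean(scores[s] for s in scores if groups[s] == "control")
print("disease:", round(disease_mean, 3), "control:", round(control_mean, 3))
```

With real data you would follow this with a proper test on the per-sample scores (for example a t-test or rank-sum test between groups), or use a ranked-list method such as fgsea on the bulk DE statistics.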

Alternatively, you need a control gene set, so that you can show that the control genes have lower expression than the disease signature genes in your bulk RNA-Seq data. For example, if the DE genes from the scRNA-Seq study show higher expression in your bulk data set than the non-DE genes do, that might indicate that the disease signature genes are activated.
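The control gene set idea can be sketched as a simple permutation test: compare the mean expression of the signature genes with that of many random gene sets of the same size drawn from the non-signature genes. This is only an illustration in Python under stated assumptions. The function name `random_set_pvalue`, the gene names, and the expression values are hypothetical, and `gene_means` is assumed to hold each gene's average normalized expression across your bulk samples.

```python
import random
from statistics import mean

def random_set_pvalue(gene_means, signature, n_perm=1000, seed=0):
    """Empirical p-value that the signature's mean expression is at least
    as high as that of random non-signature gene sets of the same size.

    gene_means: dict gene -> average expression across the bulk samples.
    """
    rng = random.Random(seed)
    sig = [g for g in signature if g in gene_means]
    sig_set = set(sig)
    # Control genes: everything measured that is not in the signature.
    background = sorted(g for g in gene_means if g not in sig_set)
    if not sig or len(background) < len(sig):
        raise ValueError("need signature genes and a larger background")
    observed = mean(gene_means[g] for g in sig)
    hits = 0
    for _ in range(n_perm):
        draw = rng.sample(background, len(sig))
        if mean(gene_means[g] for g in draw) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 100 background genes and 5 signature genes.
gene_means = {f"BG{i}": 4.0 + 0.5 * (i % 5) for i in range(100)}
gene_means.update({f"SIG{i}": 8.0 + 0.1 * i for i in range(5)})
signature = [f"SIG{i}" for i in range(5)]

observed, p_value = random_set_pvalue(gene_means, signature)
print("observed mean:", round(observed, 2), "empirical p:", p_value)
```

One caveat: genes reported as DE in scRNA-Seq tend to be fairly highly expressed to begin with, so a background restricted to genes that were detected in the single-cell data is probably a fairer control than all genes.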

4.9 years ago by GouthamAtla 12k