How can I identify cells in a single-cell experiment (SCE) using gene signatures from a bulk RNA-seq experiment?
5.3 years ago

Hello Everyone,

I have two experiments: single cell (SCE) and bulk RNA-seq (RNA). Each of these experiments has two groups: control (ctr) and treated (trt).

Experiment 1: I've used the Seurat pipeline to cluster the SCE data and identify cell types.

Experiment 2: I've used DESeq2 to identify significantly differentially expressed genes (up- or down-regulated, say n = 50) between trt and ctr in the bulk RNA-seq data.

Now I want to use these 50 significant genes from the bulk RNA-seq data to identify which cells in the SCE show a similar expression signature. If all 50 genes were regulated in the same direction (all up or all down), I could average them and compare that average across the cells in the SCE. Here, however, the signature mixes up- and down-regulated genes, so a plain mean of the fold changes is not a useful summary of the RNA-seq result: the positive and negative values cancel each other out.

What would be the best statistic to summarize a signature that contains both positive and negative values?

I would greatly appreciate any feedback, hint or suggestion.

Thank you,
Sofia

RNA-Seq next-gen • 2.2k views
5.3 years ago

Ahh, sorry, I misunderstood your original question, so I'll add a new answer. A common way to solve this problem is to generate a signature from each of your bulk conditions, say the 50-100 most significantly up- and down-regulated genes (each set is then a signature of the cells those genes are more highly expressed in), and then use a single-sample scoring tool such as ssGSEA from the GSVA R package. That gives you a score for each of the two signatures in every single cell, and you simply classify each cell as the type with the higher score.
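To make the "score both signatures per cell, then pick the higher one" idea concrete, here is a minimal sketch in plain Python. It is not ssGSEA (which is rank-based and lives in GSVA); it swaps in a much simpler score, the mean per-gene z-score of each signature, purely for illustration. The gene names, the toy expression values, and the function names `zscore_genes` and `classify_cells` are all made up for this example.

```python
from statistics import mean, pstdev


def zscore_genes(expr):
    """Z-score each gene across cells. expr maps gene -> list of per-cell values."""
    z = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        # A gene with no variation carries no information, so score it as 0.
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return z


def classify_cells(expr, trt_signature, ctr_signature):
    """Label each cell by whichever signature has the higher mean z-score."""
    z = zscore_genes(expr)
    n_cells = len(next(iter(expr.values())))
    labels = []
    for i in range(n_cells):
        trt_score = mean([z[g][i] for g in trt_signature if g in z])
        ctr_score = mean([z[g][i] for g in ctr_signature if g in z])
        labels.append("trt-like" if trt_score > ctr_score else "ctr-like")
    return labels


# Hypothetical log-normalized expression: 4 genes x 4 cells.
expr = {
    "GeneA": [5.0, 4.0, 0.5, 1.0],
    "GeneB": [3.0, 4.0, 1.0, 0.0],
    "GeneC": [0.0, 1.0, 4.0, 5.0],
    "GeneD": [1.0, 0.5, 3.0, 4.0],
}
trt_signature = ["GeneA", "GeneB"]  # assumed up in trt in the bulk data
ctr_signature = ["GeneC", "GeneD"]  # assumed down in trt, i.e. higher in ctr

labels = classify_cells(expr, trt_signature, ctr_signature)
print(labels)
```

Splitting the signature into an "up" set and a "down" set is what avoids the cancellation problem from the question: each set is averaged on its own, and only the two resulting scores are compared.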


Thank you for your advice and link. Greatly appreciated!

5.3 years ago

There is a new version of SingleR (see the vignette for examples) under development that could handle this quite easily. All it needs is your bulk count matrix (the one you used for DESeq2), your SCE count matrix, and your marker gene list. It will assign an identity to each single cell based on the expression profiles in your training set (your bulk counts).
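The general idea of reference-based annotation can be sketched as follows: correlate each cell with each reference profile over the marker genes and assign the best-matching label. This is a simplified stand-in written in plain Python, not SingleR's code or API; the reference profiles, cell values, and function names are hypothetical.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den > 0 else 0.0


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def assign_labels(cells, reference):
    """Give each cell the label of the reference profile it correlates with best."""
    assigned = []
    for cell in cells:
        scores = {label: spearman(cell, ref) for label, ref in reference.items()}
        assigned.append(max(scores, key=scores.get))
    return assigned


# Hypothetical values over four marker genes (GeneA, GeneB, GeneC, GeneD).
reference = {
    "ctr": [1.0, 2.0, 8.0, 9.0],  # e.g. mean bulk expression in control
    "trt": [9.0, 7.0, 2.0, 1.0],  # e.g. mean bulk expression in treated
}
cells = [
    [5.0, 3.0, 0.0, 1.0],
    [0.0, 1.0, 4.0, 3.0],
]

cell_labels = assign_labels(cells, reference)
print(cell_labels)
```

A rank-based correlation is used here because bulk and single-cell values sit on different scales, and ranks are insensitive to that.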


Thank you for your advice and link. Greatly appreciated!

5.3 years ago

One solution could be to correlate the log2FC of the DE genes in the bulk data with the corresponding log2FC in the single-cell data. That should take into account both the magnitude and the sign of the values.
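A per-cell variant of this idea can be sketched in plain Python. The assumption here, which is mine and not part of the suggestion above, is that each cell's deviation from the across-cell mean of a gene can stand in for a per-cell log2FC when the cell labels are unknown. The gene names, values, and the function name `per_cell_signature_correlation` are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den > 0 else 0.0


def per_cell_signature_correlation(expr, bulk_lfc):
    """Correlate each cell's centered expression with the bulk log2FC vector.

    expr maps gene -> per-cell log-normalized values;
    bulk_lfc maps gene -> log2 fold change (trt vs ctr) from the bulk analysis.
    """
    genes = [g for g in bulk_lfc if g in expr]
    lfc = [bulk_lfc[g] for g in genes]
    # Assumption: deviation from the gene's mean across all cells acts as a
    # pseudo log2FC for that cell.
    centered = {}
    for g in genes:
        gene_mean = mean(expr[g])
        centered[g] = [v - gene_mean for v in expr[g]]
    n_cells = len(expr[genes[0]])
    return [pearson([centered[g][i] for g in genes], lfc) for i in range(n_cells)]


# Hypothetical inputs: 4 signature genes, 4 cells.
bulk_lfc = {"GeneA": 2.0, "GeneB": 1.5, "GeneC": -1.8, "GeneD": -2.2}
expr = {
    "GeneA": [5.0, 4.0, 0.5, 1.0],
    "GeneB": [3.0, 4.0, 1.0, 0.0],
    "GeneC": [0.0, 1.0, 4.0, 5.0],
    "GeneD": [1.0, 0.5, 3.0, 4.0],
}

scores = per_cell_signature_correlation(expr, bulk_lfc)
print([round(s, 2) for s in scores])
```

A positive score means the cell deviates from the average cell in the same direction as the bulk signature (up genes higher, down genes lower); a negative score means the opposite. Where to put a cutoff between "follows" and "does not follow" is a separate judgment call.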


Thank you for your suggestion. However, I don't want to do an overall comparison between the SCE log2FC (trt/ctr) and the RNA-seq log2FC.

The main goal here is to classify every cell in the SCE using the signature of significant genes from the RNA-seq experiment. Each cell would then either follow the RNA-seq signature (say, score = 1) or not follow it (score = -1).

I can estimate the log2FC from the RNA-seq data (trt/ctr). For the SCE, on the other hand, I have raw, normalized, and scaled data. How can I estimate a log2FC for each cell, given that the cell labels are unknown?

Again, thank you for your time and suggestion. Any further suggestions will be greatly appreciated.

Thanks, Sofia
