Question

How to normalize bulk and scRNA-seq data in order to compare them

2.9 years ago

confused but trying • 0

Hi everyone,

I have scRNA-seq data from a tumor and bulk RNA-seq data from 3 cancer clones (each clone was grown in vitro from a single cell). We made a UMAP plot from the scRNA-seq data and found that the cells form clusters with distinct transcriptional phenotypes. We want to see where the 3 clones fall on this UMAP plot, but we don't know how to normalize the two data types so that the bulk samples can be projected onto it. Does anyone know a technique for this?

RNA-seq bulk scRNA-seq • 1.8k views

updated 2.9 years ago by James • written 2.9 years ago by confused but trying

Instead of normalizing, you can also try a custom distance metric such as cosine distance, which implicitly normalizes the data.
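To illustrate what "implicitly normalizes" means here, a minimal Python sketch with made-up numbers: cosine distance depends only on the direction of an expression vector, so multiplying a profile by a constant (for example, a much deeper bulk library) leaves the distance unchanged, while Euclidean distance grows with the scale. The profiles and the 1000x factor are assumptions for illustration only. Note that this only removes differences in overall scale; it does not correct for dropout or other platform-specific effects.

```python
import numpy as np

def cosine_distance(a, b):
    """Return 1 - cosine similarity between two 1-D expression vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical expression profile over five genes (made-up numbers).
cell_profile = np.array([0.0, 2.0, 5.0, 1.0, 0.0])

# The same relative profile at 1000x the depth, as a bulk library might be.
bulk_profile = 1000.0 * cell_profile

cos_d = cosine_distance(cell_profile, bulk_profile)   # unaffected by scale
euc_d = np.linalg.norm(cell_profile - bulk_profile)   # dominated by scale

print(f"cosine distance:    {cos_d:.6f}")
print(f"euclidean distance: {euc_d:.1f}")
```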

written 2.9 years ago by James

score 2 · Answer 1 · 2022-05-09

I don't have a great idea for projecting them, but you could use ssGSEA or a similar gene-set scoring method (e.g. GSVA) on the genes that define each of your distinct clusters. This would hopefully indicate which phenotype each of your clones is most similar to.

Alternatively, I guess you could run SingleR or a similar tool, with your single-cell dataset as the reference, to label each bulk sample and output some scores.
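As a rough illustration of the gene-set scoring idea, here is a small Python sketch. It is not ssGSEA or GSVA; it is a simplified stand-in that ranks genes within each bulk sample and averages the rank percentiles of each cluster's marker genes. The gene names, marker sets, and expression values are all hypothetical, and ties are broken arbitrarily rather than averaged.

```python
import numpy as np

def rank_signature_score(expr, gene_names, signature):
    """Mean within-sample rank percentile (0..1) of the signature genes.

    Simplified stand-in for ssGSEA/GSVA, not either algorithm itself.
    """
    expr = np.asarray(expr, dtype=float)
    # Rank genes within the sample: 0 = lowest expressed, n-1 = highest.
    ranks = expr.argsort().argsort()
    percentiles = ranks / (len(expr) - 1)
    in_signature = np.isin(gene_names, list(signature))
    return percentiles[in_signature].mean()

# Hypothetical genes and per-cluster marker sets (e.g. from scRNA-seq DE).
genes = np.array(["G1", "G2", "G3", "G4", "G5", "G6"])
markers = {"cluster_A": {"G1", "G2"}, "cluster_B": {"G5", "G6"}}

# Hypothetical bulk expression for two clones, same gene order as `genes`.
bulk = {
    "clone_1": [90, 80, 10, 12, 3, 1],
    "clone_2": [2, 4, 11, 9, 70, 95],
}

# Score every clone against every cluster signature.
scores = {
    clone: {cl: rank_signature_score(expr, genes, sig) for cl, sig in markers.items()}
    for clone, expr in bulk.items()
}
# Pick the highest-scoring cluster for each clone.
best = {clone: max(s, key=s.get) for clone, s in scores.items()}

for clone in bulk:
    print(clone, scores[clone], "->", best[clone])
```

Because the score uses only within-sample ranks, it does not require the bulk and single-cell data to be on the same scale, which is the main reason this family of methods is attractive here.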

score 1 · Answer 2 · 2022-05-09

2.9 years ago

wiscoyogi ▴ 40

Are you trying to co-cluster your single-cell and bulk data, with the idea that the distinct clones agree with distinct single-cell subtypes?

What is your objective with 'seeing where the 3 cancer clones fall on this UMAP plot'? What does that mean?

Naively, I would run differential expression on the single-cell data and then compare the expression of those genes in the bulk data. I wouldn't integrate the bulk and sc data at face value, given the technical differences between them (dropout in sc being just one of them).
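One way to sketch the "compare those genes in the bulk data" step in Python is shown below. It assumes you already have a list of DE genes and cluster labels from the single-cell analysis; it sums counts per cluster into pseudobulk profiles and then computes a Spearman correlation between each bulk sample and each cluster over the DE genes. All matrices, labels, and clone names are made up, and the choice of Spearman correlation is just one reasonable option, not something prescribed in this thread.

```python
import numpy as np

def average_ranks(x):
    """Rank values from 0..n-1, giving tied values their average rank."""
    x = np.asarray(x, dtype=float)
    order = x.argsort()
    ranks = np.empty(len(x))
    ranks[order] = np.arange(len(x))
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks."""
    return np.corrcoef(average_ranks(a), average_ranks(b))[0, 1]

# Hypothetical single-cell counts: rows = cells, columns = 4 DE genes.
sc_counts = np.array([
    [5, 0, 0, 1],
    [7, 1, 0, 0],
    [0, 0, 6, 4],
    [0, 1, 9, 3],
])
cell_cluster = np.array(["A", "A", "B", "B"])  # hypothetical cluster labels

# Pseudobulk: sum counts over the cells of each cluster to smooth dropout.
pseudobulk = {
    str(c): sc_counts[cell_cluster == c].sum(axis=0)
    for c in np.unique(cell_cluster)
}

# Hypothetical bulk expression for two clones over the same 4 DE genes.
bulk = {
    "clone_1": np.array([900, 40, 5, 30]),
    "clone_2": np.array([10, 60, 1500, 700]),
}

# Correlate every clone with every cluster profile, then take the best match.
similarity = {
    clone: {c: spearman(profile, pb) for c, pb in pseudobulk.items()}
    for clone, profile in bulk.items()
}
assigned = {clone: max(sim, key=sim.get) for clone, sim in similarity.items()}
print(assigned)
```

A rank-based correlation is used because it is unchanged by any monotone rescaling of either data set, so the bulk and single-cell values never have to be put on a common scale.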

Yes, exactly as you said, we are trying to co-cluster our sc and bulk data, with the idea that distinct clones agree with distinct single cell subtypes.

By 'seeing where the 3 cancer clones fall on this UMAP plot?', I mean adding each clone as a separate data point to the UMAP plot, and seeing what cluster each data point is in.

Yeah, I think you're right, it might not be possible to integrate the sc and bulk data. We'll probably have to use your DE approach or the ssGSEA idea suggested above.

written 2.9 years ago by confused but trying

Again, given the different natures of single-cell and bulk RNA-seq data, I would not attempt to co-cluster. I'd extract information from the single-cell clusters and then use that to query your bulk data.

written 2.9 years ago by wiscoyogi