Gene co-expression in single-cell RNA-seq

3.3 years ago

randalljellis ▴ 90

I am using single-cell RNA-seq data from the Allen Institute, and I want to look at gene co-expression in different cell populations. They provide raw UMI counts, so I'm wondering what normalization method to use (e.g., CPM, TPM, VST) to look at these correlations. Any rationale/justification is appreciated.

single-cell scRNA-seq RNA-seq correlation co-expression • 1.4k views

updated 3.3 years ago by rpolicastro 13k • written 3.3 years ago by randalljellis ▴ 90

Using sctransform is a reasonable choice; it normalizes for sequencing depth and applies a variance-stabilizing transformation (VST).

You can also simply take the log of the CPMs (with a pseudocount, since many UMI counts are zero), but this has known problems, such as residual dependence on sequencing depth (see the sctransform paper).

I wouldn't use TPMs: UMI counts generally shouldn't exhibit gene-length bias (i.e., longer genes = more counts), which is what TPM corrects for.
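For reference, the log-CPM option mentioned above can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming a dense cells x genes matrix of raw UMI counts; the function name `log_cpm`, the pseudocount of 1, and the toy numbers are all made up for the example and are not from the Allen Institute data or from any package.

```python
import numpy as np

def log_cpm(counts, pseudocount=1.0):
    """Natural log of counts-per-million for a cells x genes UMI count matrix.

    `pseudocount` is an illustrative choice; it keeps zero counts finite.
    """
    counts = np.asarray(counts, dtype=float)
    depth = counts.sum(axis=1, keepdims=True)  # total UMIs per cell
    if np.any(depth == 0):
        raise ValueError("filter out cells with zero total counts first")
    cpm = counts / depth * 1e6                 # scale every cell to one million
    return np.log(cpm + pseudocount)

# Toy matrix: 4 cells x 3 genes (made-up numbers, for illustration only)
toy_counts = np.array([
    [10, 0, 90],
    [5, 5, 40],
    [0, 2, 18],
    [20, 1, 79],
])
toy_logcpm = log_cpm(toy_counts)
```

Note that no gene-length term appears anywhere, which is the difference from TPM. Swapping `1e6` for a smaller scale factor changes only how strongly the pseudocount compresses low counts.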

3.3 years ago by dsull ★ 6.9k

The authors of Seurat now recommend against using the SCTransform-normalized counts outside of integration and dimensionality reduction. Instead, they recommend using NormalizeData.

3.3 years ago by rpolicastro 13k

Thank you. I have another question. If I want to compare correlations between populations (e.g., compare the correlation of Gene1 and Gene2 in Population 1 with that of Gene1 and Gene2 in Population 2), should I normalize each population separately or together?

3.3 years ago by randalljellis ▴ 90

If they're cell populations from the same sequenced sample, I'd normalize them together (see https://github.com/ChristophH/sctransform/issues/55).
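To make the "normalize together, then compare" workflow concrete, here is a rough sketch: all cells are normalized in one pass, then split by population label, and the Gene1/Gene2 correlation is computed within each population. The Fisher z-test used to compare the two correlations is an addition for illustration and was not suggested in this thread; it assumes independent cells and roughly bivariate-normal values, which sparse single-cell data may violate. The data are simulated, and all function names are hypothetical.

```python
import math
import numpy as np

def log_normalize(counts, scale=1e6):
    """Depth-normalize a cells x genes count matrix, then apply log1p."""
    counts = np.asarray(counts, dtype=float)
    depth = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / depth * scale)

def pearson_r(x, y):
    """Pearson correlation between two 1-D vectors."""
    return float(np.corrcoef(x, y)[0, 1])

def compare_correlations(r1, n1, r2, n2):
    """Two-sided Fisher z-test for two independent correlations.

    Undefined when either correlation is exactly +1 or -1.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p

# Simulated counts (not real data): genes 0 and 1 share a latent factor
# in population P1 but vary independently in population P2.
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 50

def simulate_population(coupled):
    counts = rng.poisson(3.0, size=(n_cells, n_genes))
    a = rng.gamma(2.0, 1.0, size=n_cells)
    b = a if coupled else rng.gamma(2.0, 1.0, size=n_cells)
    counts[:, 0] = rng.poisson(5.0 * a)
    counts[:, 1] = rng.poisson(5.0 * b)
    return counts

counts = np.vstack([simulate_population(True), simulate_population(False)])
labels = np.array(["P1"] * n_cells + ["P2"] * n_cells)

# Normalize every cell in a single pass, then subset by population.
norm = log_normalize(counts)
r_p1 = pearson_r(norm[labels == "P1", 0], norm[labels == "P1", 1])
r_p2 = pearson_r(norm[labels == "P2", 0], norm[labels == "P2", 1])
z_stat, p_value = compare_correlations(r_p1, n_cells, r_p2, n_cells)
```

With a purely per-cell normalization like this one, normalizing jointly or separately gives the same values. The distinction matters for methods that fit parameters across cells, such as sctransform, where the linked issue applies. A rank-based correlation (Spearman) may be a safer choice for zero-heavy genes.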

3.3 years ago by dsull ★ 6.9k
