Question

rna-seq heatmap z-score calculation - TPM?


3 months ago

lyc1ne • 0

I'm wondering whether visualizing heatmaps with z-scores of TPM values is an appropriate approach.

From what I've read on here, people typically work with log-transformed normalized counts. However, I've already made a number of heatmaps by z-score scaling TPM values from star-salmon. Is this approach invalid? I'd like to avoid redoing this work if it isn't necessary. From my own checks, the values seem similar to raw count z-scores.

Thanks in advance for any help!

heatmap rnaseq • 544 views

ADD COMMENT • link updated 3 months ago by ATpoint 86k • written 3 months ago by lyc1ne • 0

score 0 · Answer 1 · 2024-08-26


3 months ago

Kevin Blighe 88k

For visualisation purposes, it's okay to Z-scale TPM values. The idea is that scaling improves visual interpretation for the reader. Note that whichever heatmap function you are using may already scale by default, so please check.
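As a rough illustration of what per-gene Z-scaling does, here is a minimal sketch in plain Python. The function name `zscore_rows`, the pseudocount, and the TPM values are all hypothetical, and applying log2(TPM + 1) before scaling is an assumption (a common choice, not something stated above). The sample standard deviation (n - 1) is used, which is also what R's `scale()` does.

```python
import math
from statistics import mean, stdev


def zscore_rows(tpm, pseudocount=1.0):
    """Return per-gene (row-wise) z-scores of log2(TPM + pseudocount).

    `tpm` is a list of rows, one row per gene, one value per sample.
    """
    scaled = []
    for row in tpm:
        logged = [math.log2(v + pseudocount) for v in row]
        mu = mean(logged)
        sd = stdev(logged)  # sample SD (n - 1)
        if sd == 0:
            # A gene with no variation carries no pattern; show it as flat.
            scaled.append([0.0] * len(logged))
        else:
            scaled.append([(v - mu) / sd for v in logged])
    return scaled


# Hypothetical TPM values: rows are genes, columns are samples.
tpm = [
    [10.0, 12.0, 95.0, 110.0],        # lowly expressed, higher in samples 3-4
    [2000.0, 2100.0, 900.0, 950.0],   # highly expressed, lower in samples 3-4
    [5.0, 5.0, 5.0, 5.0],             # constant gene
]
z = zscore_rows(tpm)
```

After scaling, every varying gene has mean 0 and standard deviation 1 across samples, so the colour scale shows each gene relative to its own average. If the heatmap function already scales rows by default, passing pre-scaled values is usually harmless but redundant, so it is worth checking the function's documentation.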

The statistical inferences that you make on the data for the purposes of differential expression obviously would not have been derived from the Z-scaled TPM values. I assume that you calculated these via a standard tool, like edgeR, limma-voom, or DESeq2.

Kevin



Amazing, thank you for the quick response! Yes, I have done the statistical inferences using standard RNA-seq tools like the ones you've listed.

ADD REPLY • link 3 months ago by lyc1ne • 0

score 0 · Answer 2 · 2024-08-26

Scaling is appropriate and makes sense in most scenarios. In RNA-seq we are typically interested in emphasizing differences between groups. A heatmap without scaling, though, will emphasize absolute expression levels, which is not a meaningful choice for either clustering or visualization. You can find details in this older post of mine: Scaling RNA-Seq data before clustering?
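To make the clustering point concrete, here is a small sketch in plain Python with made-up log2 expression values; the gene names and helper functions are hypothetical, and Euclidean distance is assumed only because it is a common default in heatmap clustering. It compares distances between three genes before and after per-gene z-scoring.

```python
import math
from statistics import mean, stdev


def zscore(values):
    """Centre to mean 0 and scale to sample SD 1."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical log2 expression across four samples (group 1, 1, 2, 2).
gene_low_up = [1.0, 1.2, 3.0, 3.2]      # low expression, up in group 2
gene_high_up = [9.0, 9.2, 11.0, 11.2]   # high expression, same pattern
gene_low_down = [3.0, 3.2, 1.0, 1.2]    # low expression, opposite pattern

# Unscaled: distance is driven by absolute expression level.
raw_same_pattern = euclidean(gene_low_up, gene_high_up)
raw_opposite_pattern = euclidean(gene_low_up, gene_low_down)

# Scaled: distance is driven by the pattern across samples.
scaled_same_pattern = euclidean(zscore(gene_low_up), zscore(gene_high_up))
scaled_opposite_pattern = euclidean(zscore(gene_low_up), zscore(gene_low_down))
```

Without scaling, the two lowly expressed genes end up closer together even though they change in opposite directions, while the two genes sharing the same group pattern are far apart. After scaling, the genes with the same pattern are essentially identical and the opposite-pattern gene is the distant one, which is usually what a group comparison heatmap is meant to show.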