gplots heatmap.2 scale function not generating Z-scores between -1 to +1
6.6 years ago

Hello,

I am trying to display a heatmap using gplots heatmap.2 function, borrowing the data from: http://www.opiniomics.org/you-probably-dont-understand-heatmaps/

My problem is with regard to the scale="row" parameter. In theory, this takes the raw data and scales each row (subtracting the row mean and dividing by the row standard deviation). However, when I run it, I see that my Z-scores (color key) go beyond the -1 to +1 range:

library(gplots)

h1 <- c(10,20,10,20,10,20,10,20)
h2 <- c(20,10,20,10,20,10,20,10)

l1 <- c(1,3,1,3,1,3,1,3)
l2 <- c(3,1,3,1,3,1,3,1)

x <- rbind(h1,h2,l1,l2)

# Put samples as columns
x = t(x)

metric = "euclidean"
linkage_method = "average"


# First calculate the samples distance

# dist() calculates distances between rows, samples are on columns, therefore
# transpose the matrix
# Calculate the distance matrix between samples
samples_distance_matrix = dist(t(x), method=metric)


# Then calculate the distance between features, features are on rows
features_distance_matrix = dist(x, method=metric)


# Now produce the heatmap and dendrograms
# Rows (labelled on the right) will contain features
# Apply hclust with the linkage method specified
samples_dend = as.dendrogram(hclust(samples_distance_matrix, method=linkage_method))
features_dend = as.dendrogram(hclust(features_distance_matrix, method=linkage_method))

heatmap.2(x, Rowv = features_dend, Colv = samples_dend, trace="none", margins=c(8,5), labRow = NA, dendrogram="column", scale="row", col=colorRampPalette(c("white","darkblue")))

[Figure: Heatmap.2 scaling]

When I scale the data manually and remove scale="row", I get the expected Z-scores between -1 and +1:

heatmap.2(scale(x, center = TRUE, scale = TRUE), Rowv = features_dend, Colv = samples_dend, trace="none", margins=c(8,5), labRow = NA, dendrogram="column", col=colorRampPalette(c("white","darkblue")))

[Figure: Manual scaling]

I thought that heatmap.2 was just presenting the raw data after scaling rows. Is something else going on? I have other data whose Z-scores range from -3 to 3.5.

Thanks in advance!


I take it you don't know what a z-score is. Otherwise, could you please clarify why it should be between -1 and 1?


Here's the code from the heatmap.2 function:

if (scale == "row") {
    retval$rowMeans <- rm <- rowMeans(x, na.rm = na.rm)
    x <- sweep(x, 1, rm)
    retval$rowSDs <- sx <- apply(x, 1, sd, na.rm = na.rm)
    x <- sweep(x, 1, sx, "/")
}

So, yes, it subtracts the row mean and then divides by the standard deviation of the mean-subtracted data. As Jean-Karim mentions, this is Z-score scaling.

Generate random data and try it out:

random <- matrix(rexp(200, rate=.1), ncol=20)
rm <- rowMeans(random, na.rm=TRUE)
x <- sweep(random, 1, rm)
sx <- apply(x, 1, sd, na.rm=TRUE)
x <- sweep(x, 1, sx, "/")
range(x)
# exact values vary from run to run, since no seed is set
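
As a sanity check on the steps above, here is a minimal sketch. The seed and the names `m`, `z`, `row_means` and `row_sds` are my own choices and are not part of heatmap.2. It confirms that every scaled row ends up with mean 0 and standard deviation 1, which is all a Z-score guarantees; nothing bounds the individual values to [-1, 1].

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible
m <- matrix(rexp(200, rate = 0.1), ncol = 20)  # 10 rows x 20 columns

# Row-wise Z-scores, following the same steps as scale = "row"
row_means <- rowMeans(m)
z <- sweep(m, 1, row_means)        # subtract each row's mean
row_sds <- apply(z, 1, sd)
z <- sweep(z, 1, row_sds, "/")     # divide each row by its SD

# Each row now has mean ~0 and SD ~1
stopifnot(all(abs(rowMeans(z)) < 1e-12))
stopifnot(all(abs(apply(z, 1, sd) - 1) < 1e-12))

range(z)  # the range is not restricted to [-1, 1]
```

With skewed data such as this exponential sample, the largest values in a row usually sit well above +1.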

Sorry, silly mistake. As you say, I wasn't understanding Z-scores correctly.

So there are two issues here. First, Z-scores can go beyond -1 and +1, so there isn't a problem. Second, I made a mistake in my code: the "scale" function scales columns, not rows. Scaling the transpose of the input matrix and then transposing back gives the same answer as scale="row":

heatmap.2(t(scale(t(x), center = TRUE, scale = TRUE)), Rowv = features_dend, Colv = samples_dend, trace="none", margins=c(8,5), labRow = NA, dendrogram="column", scale="row", col=colorRampPalette(c("white","darkblue")))
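
To make the difference concrete, here is a small sketch using the matrix from the question. The names `col_scaled`, `row_scaled` and `by_hand` are mine, for illustration only. It compares plain scale(x), which standardises columns, with the transpose trick, which standardises rows, and checks the latter against the sweep-based steps from the heatmap.2 source quoted earlier in the thread.

```r
h1 <- c(10,20,10,20,10,20,10,20)
h2 <- c(20,10,20,10,20,10,20,10)
l1 <- c(1,3,1,3,1,3,1,3)
l2 <- c(3,1,3,1,3,1,3,1)
x <- t(rbind(h1, h2, l1, l2))    # 8 x 4, as in the question

col_scaled <- scale(x)           # scale() standardises each column
row_scaled <- t(scale(t(x)))     # transpose, scale, transpose back: each row

# Same steps as heatmap.2 with scale = "row", done by hand
by_hand <- sweep(x, 1, rowMeans(x))
by_hand <- sweep(by_hand, 1, apply(by_hand, 1, sd), "/")
stopifnot(isTRUE(all.equal(row_scaled, by_hand, check.attributes = FALSE)))

range(col_scaled)  # about -0.94 to 0.94
range(row_scaled)  # about -0.87 to 1.34
```

For this particular matrix the column-scaled values happen to stay inside [-1, 1], which is why the manually scaled heatmap looked "right", while the row-scaled values reach about 1.34.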

Scale should be either “none”, “row” or “column”

