Hello,
I am trying to display a heatmap using the heatmap.2 function from gplots, borrowing the data from: http://www.opiniomics.org/you-probably-dont-understand-heatmaps/
My problem concerns the scale="row" parameter. In theory, this takes the raw data and scales each row (subtracting the row mean and dividing by the row standard deviation). However, when I run it, the Z-scores in the color key go beyond the -1 to +1 range:
library(gplots)
h1 <- c(10,20,10,20,10,20,10,20)
h2 <- c(20,10,20,10,20,10,20,10)
l1 <- c(1,3,1,3,1,3,1,3)
l2 <- c(3,1,3,1,3,1,3,1)
x <- rbind(h1,h2,l1,l2)
# Put samples as columns
x = t(x)
metric = "euclidean"
linkage_method = "average"
# First calculate the distance matrix between samples.
# dist() computes distances between rows and samples are on columns,
# so transpose the matrix first.
samples_distance_matrix = dist(t(x), method=metric)
# Then calculate the distance between features, features are on rows
features_distance_matrix = dist(x, method=metric)
# Now produce the dendrograms and the heatmap.
# Rows hold features, columns hold samples.
# Apply hclust with the linkage method specified above
samples_dend = as.dendrogram(hclust(samples_distance_matrix, method=linkage_method))
features_dend = as.dendrogram(hclust(features_distance_matrix, method=linkage_method))
heatmap.2(x, Rowv = features_dend, Colv = samples_dend, trace="none", margins=c(8,5), labRow = NA, dendrogram="column", scale="row", col=colorRampPalette(c("white","darkblue")))
When I scale the data manually and remove scale="row", I get the expected Z-scores between -1 and +1:
heatmap.2(scale(x, center = TRUE, scale = TRUE), Rowv = features_dend, Colv = samples_dend, trace="none", margins=c(8,5), labRow = NA, dendrogram="column", col=colorRampPalette(c("white","darkblue")))
I thought that heatmap.2 was just presenting the raw data after scaling rows. Is something else going on? With other data, I get Z-scores ranging from -3 to 3.5.
Thanks in advance!
I take it you don't know what a Z-score is. Otherwise, could you please clarify why it should be between -1 and 1?
Here's the code from the heatmap.2 function:
So, yes, it subtracts the row mean and then divides by the standard deviation of the mean-subtracted data. As Jean-Karim mentions, this is Z-score scaling.
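As a minimal sketch of that row scaling (a paraphrase of the idea, not the verbatim gplots source; `row_zscore` is a made-up helper name and not part of gplots):

```r
# Hypothetical helper: Z-score each row of a numeric matrix.
row_zscore <- function(x) {
  # Subtract each row's mean (sweep() defaults to subtraction)
  centred <- sweep(x, 1, rowMeans(x, na.rm = TRUE))
  # Standard deviation of each mean-subtracted row
  row_sd <- apply(centred, 1, sd, na.rm = TRUE)
  # Divide each row by its own standard deviation
  sweep(centred, 1, row_sd, "/")
}
```

Each row of the result has mean 0 and standard deviation 1, but nothing restricts the individual values to the -1 to +1 range.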
Generate random data and try it out:
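Something along these lines would do, as a rough sketch; the seed, matrix size and distribution parameters are arbitrary choices for illustration:

```r
set.seed(1)                                            # arbitrary seed, for reproducibility only
m <- matrix(rnorm(200, mean = 10, sd = 3), nrow = 10)  # 10 rows x 20 columns of random data

# scale() works on columns, so transpose, scale, and transpose back
# to get row-wise Z-scores
z <- t(scale(t(m)))

round(range(rowMeans(z)), 10)  # row means are (numerically) zero
range(apply(z, 1, sd))         # row standard deviations are one
range(z)                       # individual Z-scores are not confined to [-1, 1]
```

A Z-score counts how many standard deviations a value lies from the mean, so values beyond 1 in absolute size are to be expected.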
Sorry, silly mistake, as you say, I wasn't understanding Z-scores correctly.
So there are two points here. First, Z-scores can go beyond -1 and +1, so there isn't a problem with the heatmap. Second, I made a mistake in my code: the "scale" function scales columns, not rows. Scaling the transpose of the input matrix and then transposing back, i.e. t(scale(t(x))), gives the same answer as scale="row".
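A short sketch to check this on the data from my question; the variable names `by_col`, `by_row` and `manual` are just for illustration:

```r
h1 <- c(10, 20, 10, 20, 10, 20, 10, 20)
h2 <- c(20, 10, 20, 10, 20, 10, 20, 10)
l1 <- c(1, 3, 1, 3, 1, 3, 1, 3)
l2 <- c(3, 1, 3, 1, 3, 1, 3, 1)
x <- t(rbind(h1, h2, l1, l2))

by_col <- scale(x)          # what my manual scaling did: column-wise
by_row <- t(scale(t(x)))    # row-wise, matching what scale="row" describes

# Row scaling written out by hand: subtract row means, divide by row SDs
manual <- sweep(x, 1, rowMeans(x))
manual <- sweep(manual, 1, apply(manual, 1, sd), "/")

all.equal(by_row, manual, check.attributes = FALSE)  # same values
range(by_col)   # stays inside -1 to +1 for this data
range(by_row)   # goes beyond +1 for this data
```

For this data the column-wise values stay within about ±0.94, while the row-wise values reach about 1.34, which is why the two heatmaps showed different color keys.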
The scale argument should be one of "none", "row" or "column".