R: Heatmaps Log2 Scale Or Non Log Scale?
13.9 years ago
Timtico ▴ 330

Dear all,

I'm producing heatmaps on the log2 scale (RMA). When I produce heatmaps after transforming back to the non-log scale, they obviously look different, and in some cases differential expression is easier to see. What arguments are there for staying on the log scale?

r microarray heatmap • 23k views
13.9 years ago
toni ★ 2.2k

Heatmaps on the non-log2 scale tend to look very saturated because the raw values span such a large numerical range. I personally do not like them.

Nevertheless, on the log scale you often get a mostly dark image because of outliers in the distribution of intensities. Again, that makes differences hard to see.
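To make the comparison concrete, here is a minimal sketch on simulated values (the numbers are made up, not real RMA output, and `top.bin.share` is a hypothetical helper). It measures what fraction of all values would share a single color if 128 colors were spread linearly over the data range.

```r
set.seed(1)
# Hypothetical log2 intensities: a bulk around 7 plus a few high values
log2.vals <- c(rnorm(5000, mean = 7, sd = 1), rnorm(50, mean = 14, sd = 0.5))
raw.vals <- 2^log2.vals                # back-transformed (non-log) values

# Fraction of values landing in the most crowded of n.bins equal-width bins,
# i.e. the share of the heatmap that would get one and the same color
top.bin.share <- function(x, n.bins = 128) {
  bin <- cut(x, breaks = n.bins, labels = FALSE)
  max(tabulate(bin, nbins = n.bins)) / length(x)
}

share.raw <- top.bin.share(raw.vals)
share.log2 <- top.bin.share(log2.vals)
c(raw = share.raw, log2 = share.log2)
```

On data shaped like this, the raw scale puts most of the matrix into one color bin, while the log2 scale spreads the bulk over many more bins.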

What I would advise is to draw the heatmap on the log-scale.

But the trick is to change how the color scale is generated. Do not make it linear in your ordered intensities; instead, skew it at both ends to reduce the effect of outliers.

When I was analyzing microarrays, I wrote this code to adjust the contrast of my heatmaps.

Let 'mat' be the matrix to plot:

# For a heatmap from green to red (grid of 128 colors here)
n.colors <- 128
colors <- colorRampPalette(c("green", "black", "red"), space = "rgb")(n.colors)

# Contrast parameters
ncols <- 18                            # Number of colors (out of 128) used
                                       # for each end:
                                       # end1 = min value to lower whisker
                                       # end2 = upper whisker to max value
z <- as.vector(mat)
min.z <- min(z)                        # The min intensity in the matrix
max.z <- max(z)                        # The max intensity in the matrix
b <- boxplot(z, plot = FALSE)          # Get the distribution
low.w <- b$stats[1, 1]                 # Lower whisker
up.w <- b$stats[5, 1]                  # Upper whisker

# Now create the non-linear color bar values.
# breaks is a numeric vector of length 129: image() needs one more
# break than colors, because the breaks are the edges of the color bins.
# length.out fixes the number of values per segment; [-1] drops the
# whisker value that the previous segment already ends on.
breaks <- c(
  seq(min.z, low.w, length.out = ncols + 1),
  seq(low.w, up.w, length.out = n.colors - 2 * ncols + 1)[-1],
  seq(up.w, max.z, length.out = ncols + 1)[-1]
)
image(t(mat[nrow(mat):1, ]), col = colors, breaks = breaks, axes = FALSE)

# colorbar (just to plot the color bar indexed by intensities)
par(omi = c(0, 0.1, 0.1, 0.1), mar = c(4, 0, 0, 0))
image(matrix(seq_along(colors), ncol = 1), col = colors, xaxt = "n", yaxt = "n")
# The middle tick is labeled with the midpoint between the two whiskers
axis(1, at = c(1, ncols, n.colors / 2, n.colors - ncols, n.colors) / n.colors,
     labels = round(c(min.z, low.w, (low.w + up.w) / 2, up.w, max.z), 2), las = 2)

You can play with the ncols variable and put the breaks wherever you want in your distribution.

You can also change the total number of colors if you wish (128 here).
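If you expect to change both settings often, the steps above can be wrapped in a small function. This is only a sketch: the name `contrast.breaks` and its arguments are hypothetical, the example matrix is simulated, and `boxplot.stats()` is used here as a plot-free way to get the same whiskers.

```r
# Hypothetical helper (not part of the original code): returns n.colors + 1
# breaks, spending 'ncols' colors on each tail beyond the boxplot whiskers.
contrast.breaks <- function(mat, n.colors = 128, ncols = 18) {
  stopifnot(2 * ncols < n.colors)
  z <- as.vector(mat)
  z <- z[is.finite(z)]                 # image() needs finite breaks
  w <- boxplot.stats(z)$stats          # elements 1 and 5 are the whiskers
  c(seq(min(z), w[1], length.out = ncols + 1),
    seq(w[1], w[5], length.out = n.colors - 2 * ncols + 1)[-1],
    seq(w[5], max(z), length.out = ncols + 1)[-1])
}

set.seed(42)
mat <- matrix(rnorm(200, mean = 8, sd = 1.5), nrow = 20)  # made-up log2 data
n.colors <- 64
colors <- colorRampPalette(c("green", "black", "red"))(n.colors)
breaks <- contrast.breaks(mat, n.colors = n.colors, ncols = 8)
image(t(mat[nrow(mat):1, ]), col = colors, breaks = breaks, axes = FALSE)
```

A larger ncols gives the tails more of the palette, and a smaller one keeps more contrast for the bulk of the distribution.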

Hope this helps.

T.


If interested readers want to cluster and mess around with the color scale without writing code, the Multiple Experiment Viewer application (http://www.tm4.org/mev/) makes it extremely easy to load data, cluster, and change the dynamic range or color choices. MEV does a boatload of other things as well.


image(t(heat[nrow(heat):1,]),col=colors,breaks=breaks,axes=FALSE);

Error in image.default(t(heat[nrow(heat):1, ]), col = colors, breaks = breaks, : 'breaks' should be taken one more than 'colors'

What does it mean?
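For what it is worth, image() raises this error when the breaks vector does not have exactly one more value than the color vector. One way that can happen, assuming the breaks are built with seq(by = ...), is data with no outliers at one end, so that the whisker equals the minimum or maximum and the step size becomes 0. A minimal sketch with made-up numbers:

```r
ncols <- 18
min.z <- 5
low.w <- 5   # hypothetical case: the lower whisker equals the minimum

# With a step of 0, seq() returns a single value instead of ncols + 1 values,
# so the final breaks vector ends up too short for the palette
n.by <- length(seq(min.z, low.w, by = (low.w - min.z) / ncols))

# length.out always yields the requested number of values
n.len <- length(seq(min.z, low.w, length.out = ncols + 1))

c(by = n.by, length.out = n.len)
```

Comparing length(breaks) with length(colors) + 1 before calling image() shows quickly whether this is the cause.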

13.9 years ago
Gww ★ 2.7k

I think the right scale for your heatmap depends on the dynamic range of your data. If you have a huge dynamic range, you may not be able to see differential expression very well, especially if the very highly expressed values are "outliers" and the rest of the distribution sits in the middle or lower end of the range.
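A rough way to check this on your own data is to compare the range between the boxplot whiskers with the full range. The sketch below uses simulated values, and the ratio is just an illustrative summary, not a standard statistic.

```r
set.seed(7)
# Made-up log2 values: a bulk around 7 plus three very high "outliers"
x <- c(rnorm(1000, mean = 7, sd = 1), 15, 15.5, 16)

w <- boxplot.stats(x)$stats            # elements 1 and 5 are the whiskers
full.range <- diff(range(x))
bulk.range <- w[5] - w[1]

# Share of a linear color scale that is left for the bulk of the data
bulk.share <- bulk.range / full.range
bulk.share
```

A small share means that most of the colors are spent on a handful of extreme values, which is the situation described above.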


That's exactly what I was thinking about, thanks.
