Tutorial: Heatmaps in R
8.3 years ago

As heatmaps in R are a recurring theme, I thought I'd collect information here:

1. List of heatmap functions (and packages) in R:

2. Tutorials:

3. Biostars posts:

EDIT 2017-02-03: Added more heatmaps. Thanks to people contributing below.
EDIT 2017-03-14: Added another one.
EDIT 2019-05-28: Added another one.
EDIT 2019-09-06: Added tutorial for EnrichedHeatmap (thanks to ATpoint)
EDIT 2020-08-14: Update: GMD and d3heatmap not on CRAN.
EDIT 2022-05-04: Update: heatmap.plus not on CRAN.

heatmap R • 36k views

We need to modify this xkcd strip to replace "standards" with "heatmap functions in R".

[Image: xkcd comic strip]


...or "pipeline frameworks written in Python." :) I find it interesting that there are so many tools/packages in the same application space, especially when they are implemented in almost the same way.


Heatmaps in R are a curious beast as they do a lot more than "just" draw the heatmap.

They promote a flawed ideology that is endemic in R-based software. Instead of breaking the process down into logical units that one can easily build upon:

  1. compute distances
  2. perform a clustering
  3. scale the data (this can also be done as an earlier step, and the order can matter)
  4. draw the heatmap

they give you one "convenient" function with lots of parameters, but all that leads to is a misleading simplicity. It is really not clear what takes place and whether those steps are universally applicable. Hence the many options, none of which addresses the real problem: overly tightly coupled concepts.

Most people who use heatmaps probably do not fully understand how these heatmaps are created, or that essential data processing steps are involved that fundamentally alter what the heatmap will look like.
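To make the four steps above concrete, here is a minimal base R sketch that performs each one explicitly. The toy matrix, the row-wise z-scoring, the Euclidean distance and the complete linkage are all assumptions chosen for illustration, not recommendations; here scaling is done first, which is only one of the possible orders.

```r
# Toy data (assumption): 20 features (rows) x 6 samples (columns), random values.
set.seed(1)
mat <- matrix(rnorm(120), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:6)))

# Scale: z-score each row. scale() works on columns, hence the t().
scaled <- t(scale(t(mat)))

# Compute distances between rows and between columns.
row_dist <- dist(scaled, method = "euclidean")
col_dist <- dist(t(scaled), method = "euclidean")

# Cluster.
row_tree <- hclust(row_dist, method = "complete")
col_tree <- hclust(col_dist, method = "complete")

# Draw: reorder explicitly, then plot with no further hidden processing.
ordered <- scaled[row_tree$order, col_tree$order]
image(x = seq_len(ncol(ordered)), y = seq_len(nrow(ordered)), z = t(ordered),
      axes = FALSE, xlab = "", ylab = "", col = hcl.colors(25, "Blue-Red"))
axis(1, at = seq_len(ncol(ordered)), labels = colnames(ordered), las = 2)
axis(2, at = seq_len(nrow(ordered)), labels = rownames(ordered), las = 2)
```

Because every step is a separate line, it is easy to swap one out, for example clustering the unscaled `mat` and scaling only for display, which is what base `heatmap()` does with its default `scale = "row"`, and see how the picture changes.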


This reminds me of the "Quilt plot" paper from a few years ago. The paper received a lot of hate, but it proposed a much simpler implementation of the heatmap function in R (without clustering). I think such a paper would never have existed if the base R function were more intuitive/simple.


nice


Someone mentioned to me in the office that the list is missing d3heatmap and superheat.


Don't forget aheatmap from the NMF package. I also often roll my own with image() and layout(), and it is TERRIBLE!
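For anyone curious what rolling your own with image() and layout() involves, here is a rough sketch with a heatmap panel and a colour key. The toy matrix, the palette and the panel widths are arbitrary assumptions, and nothing is clustered or scaled.

```r
# Toy data (assumption): 10 rows x 6 columns of random values.
set.seed(42)
mat <- matrix(rnorm(60), nrow = 10,
              dimnames = list(paste0("r", 1:10), paste0("c", 1:6)))

pal <- hcl.colors(50, "viridis")
breaks <- seq(min(mat), max(mat), length.out = length(pal) + 1)

# Two panels side by side: heatmap (wide) and colour key (narrow).
layout(matrix(1:2, nrow = 1), widths = c(5, 1))

# Panel 1: image() draws z[i, j] at (x[i], y[j]), so transpose the matrix
# and flip the rows to get row 1 at the top.
par(mar = c(4, 4, 1, 1))
z <- t(mat[nrow(mat):1, ])
image(seq_len(nrow(z)), seq_len(ncol(z)), z, col = pal, breaks = breaks,
      axes = FALSE, xlab = "", ylab = "")
axis(1, at = seq_len(nrow(z)), labels = rownames(z), las = 2)
axis(2, at = seq_len(ncol(z)), labels = colnames(z), las = 2)

# Panel 2: colour key, one cell per colour, drawn between the break points.
par(mar = c(4, 1, 1, 3))
key_vals <- (breaks[-1] + breaks[-length(breaks)]) / 2
image(c(0, 1), breaks, matrix(key_vals, nrow = 1), col = pal, breaks = breaks,
      axes = FALSE, xlab = "", ylab = "")
axis(4, las = 2)

# Reset to a single panel.
layout(1)
```

The transposing and flipping alone shows why this approach gets tedious quickly.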


I do it with ggplot2 and it is awful, but I like the flexibility it gives me with the layout.


Also add corrplot for correlation heatmaps.

A tutorial is here:

https://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html


I just wrote up a post on heatmaps; check it out here: http://rpubs.com/crazyhottommy/a-tale-of-two-heatmap-functions


Hi Jean-Karim Heriche,

I haven't tried it (yet*) but this one looks pretty: Superheat: An R package for creating beautiful and extendable heatmaps for visualizing complex data. It is also on GitHub.

*PhD-student-slang for "I would certainly like to but never will"


Hi Jean-Karim Heriche, I am wondering how I can make a clustered heatmap where I assign samples to predefined clusters. Thanks!


Just reorder the matrix as per your clustering. If you're working with hierarchical clustering, you can do something like this with base R:

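tree <- hclust(dist(data.matrix, method = ...), method = ...)
# Reorder manually and switch off heatmap()'s own reordering (one tree for rows and columns assumes a square, symmetric matrix):
data.matrix.reordered <- data.matrix[tree$order, tree$order]
heatmap(data.matrix.reordered, Rowv = NA, Colv = NA)
# Or pass the dendrogram together with the original, not reordered, matrix:
heatmap(data.matrix, Rowv = as.dendrogram(tree), Colv = as.dendrogram(tree))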

Also check the various heatmap packages. For example, pheatmap and ComplexHeatmap allow you to split a heatmap based on clusters.
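If the clusters are predefined rather than computed, the same reordering idea applies. The sketch below is one way to do it in base R; the matrix, the cluster labels and the colours are made up for illustration.

```r
# Toy data (assumption): 10 genes (rows) x 8 samples (columns), random values.
set.seed(7)
mat <- matrix(rnorm(80), nrow = 10,
              dimnames = list(paste0("gene", 1:10), paste0("sample", 1:8)))

# Hypothetical predefined cluster labels, one per sample.
clusters <- c(sample1 = "B", sample2 = "A", sample3 = "C", sample4 = "A",
              sample5 = "B", sample6 = "C", sample7 = "A", sample8 = "B")

# Order the columns by cluster label; the second key keeps the original
# order within each cluster.
labels_by_col <- clusters[colnames(mat)]
col_order <- order(labels_by_col, seq_along(labels_by_col))
mat_ordered <- mat[, col_order]
cl_ordered <- clusters[colnames(mat_ordered)]

# One colour per cluster for the annotation bar above the heatmap.
cluster_cols <- c(A = "tomato", B = "steelblue", C = "goldenrod")

heatmap(mat_ordered,
        Colv = NA,  # keep our column order; rows are still clustered
        ColSideColors = unname(cluster_cols[cl_ordered]),
        scale = "row")
```

Setting `Colv = NA` is what stops `heatmap()` from re-clustering the columns and undoing the predefined grouping.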


Missing heatmap3 (GitHub, CRAN).

Useful tutorial at http://compbio.ucsd.edu/making-heat-maps-r/ that covers and compares heatmap, heatmap.2, aheatmap, pheatmap, heatmap3, and annHeatmap2 (heatplus). It also carefully points out the difference between heatmap3 and heatmap.3, and notes that the latter is associated not just with the GMD package as listed here but also several other implementations that collide on the name.


I have developed a new heatmap package, ggalign, built upon ggplot2, which enables the creation of heatmaps as complex as those in ComplexHeatmap, while adhering to the grammar of graphics.


In the documentation, you should start with an example of the problem the new software solves, such as before and after images.

Otherwise people can't quite tell what it means when you say:

consistent axis alignment across multiple plots


Thank you for your suggestions! I will include a more detailed description in the README. For your convenience, I'll provide a few small examples here.

ggalign extends ggplot2 by providing advanced tools for aligning and organizing multiple plots, particularly those that automatically reorder observations, such as dendrograms. It offers fine control over layout adjustment and plot annotations, enabling you to create complex, publication-quality visualizations while still using the familiar grammar of ggplot2.

Why use ggalign?

ggalign focuses on aligning observations across multiple plots. It leverages the "number of observations" in the vctrs package or NROW() function to maintain consistency in plot organization.

If you've ever struggled with aligning plots that have a self-contained ordering (like a dendrogram), or with applying consistent grouping or ordering across multiple plots (e.g., with k-means clustering), ggalign is designed to make this easier. The package integrates seamlessly with ggplot2, providing the flexibility to use its geoms, scales, and other components for complex visualizations.

Getting Started

Using ggalign is simple if you're familiar with ggplot2 syntax. The workflow is:

  • Initialize the layout using ggheatmap() or ggstack().
  • Customize the layout with:
    • align_group(): Group the layout axis into panels with a group variable.
    • align_kmeans(): Group the layout axis into panels by k-means clustering.
    • align_reorder(): Reorder layout observations based on statistical weights, or manually based on user-defined criteria.
    • align_dendro(): Reorder or group the layout based on hierarchical clustering.
  • Add plots with ggalign() or ggpanel(), then add ggplot2 elements like geoms, stats, and scales.

Basic example

Below, we'll walk through a basic example of using ggalign to create a heatmap with a dendrogram.

library(ggalign)
set.seed(123)
small_mat <- matrix(rnorm(81), nrow = 9)
rownames(small_mat) <- paste0("row", seq_len(nrow(small_mat)))
colnames(small_mat) <- paste0("column", seq_len(ncol(small_mat)))

# initialize the heatmap layout, we can regard it as a normal ggplot object
ggheatmap(small_mat) + 
    # we can directly modify geoms, scales and other ggplot2 components
    scale_fill_viridis_c() +
    # add annotation in the top
    hmanno("top") +
    # in the top annotation, we add a dendrogram, and split observations into 3 groups
    align_dendro(aes(color = branch), k = 3) +
    # in the dendrogram we add a point geom
    geom_point(aes(color = branch, y = y)) +
    # change color mapping for the dendrogram
    scale_color_brewer(palette = "Dark2")

[Image: heatmap with a colored dendrogram in the top annotation]

Comparison with other ggplot2 heatmap extensions

The main advantage of ggalign over other extensions like ggheatmap is its full compatibility with the ggplot2 grammar. You can seamlessly use any ggplot2 geoms, stats, and scales to build complex layouts, including multiple heatmaps arranged vertically or horizontally.

Comparison with ComplexHeatmap

Pros

  • Full integration with the ggplot2 ecosystem.
  • Heatmap annotation axes and legends are automatically generated.
  • Dendrograms can be easily customized and colored.
  • Flexible control over plot size and spacing.
  • Can easily align with other ggplot2 plots by panel area.

Cons

  • Fewer built-in annotations: may require additional coding for specific annotations or customization, compared to the extensive built-in annotation functions in ComplexHeatmap.

More Complex Examples

The package's seamless integration with ggplot2 allows for the creation of more complex heatmaps and visualizations using ggplot2 syntax.

[Image: more complex ggalign example]
