Add metadata based on tree structure
5.4 years ago
Jack ▴ 50


I just started working with ggtree to visualise phylogenetic trees and so far it works very well.

Now I would like to add some metadata (from a tab-delimited, CSV or Excel file) alongside the tree, aligned with its tips.

Ideally it would look similar to this example: enter image description here

But instead of the SNP and trait data, I would like to use simple text columns from that file.

[Tree//header] [metadata1//header] [metadata2//header] [...//header]

ID1 MetadataForID1 moreMetadataForID1 ...

Is there maybe a simple way to add this kind of data?

Thank you!

R ggtree • 6.1k views

Thank you for your answers!

With the help of the following code I am able to add metadata that is saved in a .txt file ('d1'):

ggtree(tree) %<+% d1 +
  geom_tiplab() +
  geom_tiplab(aes(label = factor(sample)), offset=0.1) +
  theme_tree2() +
  geom_tiplab(aes(label = factor(metadata)), offset=0.4) +
  xlim(NA, 4)

enter image description here

After aligning the labels, I cannot remove the connecting lines:

ggtree(tree) %<+% d1 +
  geom_tiplab(align=TRUE, linesize=.5) +
  geom_tiplab(aes(label = factor(sample)), offset=0.1, align=TRUE) +
  theme_tree2() +
  geom_tiplab(aes(label = factor(metadata)), offset=0.4, align=TRUE) +
  xlim(NA, 4)

enter image description here

Is there a way to remove the lines for 'sample' and 'metadata'? And is it possible to add a header with a grey background, just like the one in my first post, for 'Sample' and 'metadata'?
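
One possible direction, offered as a sketch rather than a confirmed answer: the later code in this thread passes linetype=NA to geom_tiplab() for the aligned text columns, which appears to suppress the connecting line for that layer. The grey header boxes below use ggplot2's annotate("label", ...) with a fill colour. The random five-tip tree, the stand-in 'd1' and the 'header' data frame are made up for illustration, and the behaviour of linetype=NA should be checked against the installed ggtree version.

```r
# Column titles and offsets; the offsets are the ones used in the code above.
header <- data.frame(
  title  = c("Sample", "Metadata"),
  offset = c(0.1, 0.4)
)

# Plot only if the optional packages are installed.
if (requireNamespace("ape", quietly = TRUE) &&
    requireNamespace("ggtree", quietly = TRUE)) {
  library(ggplot2)
  library(ggtree)

  # Illustrative stand-ins for 'tree' and 'd1' from the post.
  set.seed(1)
  tree <- ape::rtree(5)
  d1 <- data.frame(
    label    = tree$tip.label,
    sample   = paste0("Sample_", seq_along(tree$tip.label)),
    metadata = paste0("metadata_", seq_along(tree$tip.label))
  )

  p <- ggtree(tree) %<+% d1 +
    geom_tiplab(align = TRUE, linesize = 0.5) +
    # linetype = NA: assumed to suppress the connecting line for this layer
    geom_tiplab(aes(label = sample), offset = 0.1, align = TRUE, linetype = NA) +
    geom_tiplab(aes(label = metadata), offset = 0.4, align = TRUE, linetype = NA) +
    theme_tree2()

  # Titles on a grey box, one row above the topmost tip.
  p <- p +
    annotate("label",
             x = max(p$data$x) + header$offset,
             y = max(p$data$y) + 1,
             label = header$title,
             hjust = 0, fill = "grey85") +
    xlim(NA, 4)
}
```

The offsets and the xlim() upper bound will likely need tuning for a real tree so that the columns neither overlap nor get cut off.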

5.4 years ago
Guangchuang Yu ★ 2.6k

The first column should be node numbers or labels for ggtree to link your data to the tree.

Please refer to our paper; there are a number of examples in the supplemental file.
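
To illustrate the linking requirement, here is a minimal sketch, assuming a tab-delimited file whose first column holds the tip labels. The inline text stands in for a real file, and the three-tip Newick string and the variable names are made up for illustration. It checks that the first column matches the tip labels before attaching the data with %<+%.

```r
# Hypothetical metadata, inlined here instead of being read from a file;
# with a real file this would be something like read.delim("metadata.txt").
meta_txt <- paste(
  "label\tsample\tmetadata",
  "t1\tSample_1\tmetadata_1",
  "t2\tSample_2\tmetadata_2",
  "t3\tSample_3\tmetadata_3",
  sep = "\n"
)
d1 <- read.delim(text = meta_txt, stringsAsFactors = FALSE)

# Tip labels of the illustrative tree; normally taken from tree$tip.label.
tip_labels <- c("t1", "t2", "t3")

# Labels that appear on only one side cannot be linked.
unmatched_tips <- setdiff(tip_labels, d1[[1]])
unmatched_rows <- setdiff(d1[[1]], tip_labels)

# Plot only if the optional packages are installed.
if (requireNamespace("ape", quietly = TRUE) &&
    requireNamespace("ggtree", quietly = TRUE)) {
  library(ggtree)
  tree <- ape::read.tree(text = "((t1:1,t2:1):1,t3:2);")
  # %<+% joins d1 to the tree data using d1's first column.
  p <- ggtree(tree) %<+% d1 +
    geom_tiplab() +
    geom_tiplab(aes(label = sample), offset = 0.5)
}
```

If either unmatched_tips or unmatched_rows is non-empty, the corresponding tips or rows have no partner, so it is worth checking both before plotting.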

5.4 years ago
thackl ★ 3.0k

If you are interested in some flexibility beyond ggtree's nice built-in functions, check out my blog post about an alternative approach:


Thanks for the links!

Unfortunately, I could not find any example with plain text.

All the examples use only geom_tiplab() for text and then add other data next to it, for example SNP or trait data, a matrix, a sequence alignment or a plot. I could not find any example with two or three columns of text.

So all I have is a text file that looks like this:

label  sample    metadata
t1     Sample_1  metadata_1
t2     Sample_2  metadata_2
t3     Sample_3  metadata_3
t4     Sample_4  metadata_4
t5     Sample_5  metadata_5

No data and no x values.

So right now I have no clue how to realise this.

That's why I drew a picture of the output that I would like to achieve: enter image description here

Does anyone have an idea how to implement this?

1 day ago
rsieber ▴ 10

I know this is an old post, but it seems to me that the question is still relevant.
I have achieved a table-like structure using the geom_tiplab function, but there must be a much better and easier way to do this. Does anyone have a suggestion?
Here's my code and the result:


## Packages used below
library(ape)      # rtree()
library(ggplot2)  # aes(), annotate(), xlim()
library(ggtree)   # ggtree(), %<+%, geom_tiplab()
library(dplyr)    # tibble(), %>%, slice(), pull()

## Create a random tree
tree <- rtree(20)

## And some random metadata; the first column holds the tip labels
df <- tibble(
  tip_name = tree$tip.label,
  long_and_disturbing_colname = paste(tree$tip.label, 'meta1', sep="_"),
  other_colname = paste(tree$tip.label, 'long_long_meta2', sep="_"),
  other_colname2 = paste(tree$tip.label, 'meta3', sep="_")
)

## Add table data: column title = metadata column name
tab_cols <- c(
  "Meta1" = "long_and_disturbing_colname",
  "Meta2" = "other_colname",
  "Meta3" = "other_colname2"
)

## One row per text column; 'name' and 'ring_label' are pulled in the loop below
table_tib <- tibble(
  name = unname(tab_cols),
  ring_label = names(tab_cols),
  rel_width=c(1, 2, 1)
)

## And also gather some plotting parameters for this in a one-row data frame
tabv <- data.frame(
  offs_s = 1,
  ann_yjust = 1.0, ann_xjust = 0.000,
  ann_vjust = 0.5, ann_hjust = 0,
  ann_size = 3,
  xdiff = 1,
  size = 2.5
)

## Add column widths and start offsets to the table from above
table_tib$width <- as.numeric(table_tib$rel_width)*tabv$xdiff
table_tib$x_start <- cumsum(c(0, table_tib$width[1:(nrow(table_tib)-1)]))

## Plot the tree
p <- ggtree(tree, layout="rectangular") %<+% df
q <- p + geom_tiplab(aes(label=label), size=2.6, linesize=0.02, linetype="dashed", geom="text",
              align=FALSE, offset=0.1) # Tip labels
## Add one aligned text column (plus its title) per row of table_tib
for(i in 1:nrow(table_tib)) {
  table_row <- table_tib %>% slice(i)
  col_name <- table_row %>% pull("name")
  x_off <- tabv$offs_s + table_row %>% pull("x_start")
  ## !!rlang::sym() inserts the column name now, so each layer keeps its own column
  q <- q + geom_tiplab(aes(subset=isTip, label=!!rlang::sym(col_name)), size=tabv$size, geom="text", align=TRUE, linesize=0.05, linetype=NA, show.legend=FALSE, offset=x_off) +
    annotate(geom="text", x=max(p$data$x) + x_off + tabv$ann_xjust, y=max(p$data$y) + tabv$ann_yjust,
             label=table_row %>% pull("ring_label"), size=tabv$ann_size, angle=0, vjust=tabv$ann_vjust, hjust=tabv$ann_hjust)
}
## Plot with xlim to avoid cutting the last column
q + xlim(NA,10)

And here is the result: Random tree with columns of text metadata

So the desired features are:

  • pure text in form of table linked by (sorted by) the tip labels matching the first column of the tibble
  • adjustable text size, offset etc. parameters
  • (custom) column titles
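
The three features listed above could be wrapped in a small helper. This is a minimal sketch, not a ggtree feature: column_offsets() and add_text_columns() are hypothetical names introduced here, the column placement is simple cumulative-width arithmetic, and linetype=NA is assumed to suppress the connecting lines.

```r
# Start offset of each text column from its relative width:
# each column starts where the widths of the previous ones end.
column_offsets <- function(rel_width, unit = 1, start = 1) {
  start + cumsum(c(0, head(rel_width, -1))) * unit
}

# Add one aligned text column per entry of 'cols' to a ggtree plot 'p'.
# 'cols' is a named character vector: names are titles, values are columns.
add_text_columns <- function(p, cols, rel_width, size = 2.5) {
  offs  <- column_offsets(rel_width)
  x_max <- max(p$data$x, na.rm = TRUE)
  y_top <- max(p$data$y, na.rm = TRUE) + 1
  for (i in seq_along(cols)) {
    p <- p +
      # !!rlang::sym() fixes the column name when the layer is created
      ggtree::geom_tiplab(ggplot2::aes(label = !!rlang::sym(cols[[i]])),
                          align = TRUE, linetype = NA,
                          offset = offs[i], size = size) +
      ggplot2::annotate("text", x = x_max + offs[i], y = y_top,
                        label = names(cols)[i], hjust = 0,
                        size = size, fontface = "bold")
  }
  p
}

# Usage, run only if the optional packages are installed.
if (requireNamespace("ape", quietly = TRUE) &&
    requireNamespace("ggtree", quietly = TRUE)) {
  library(ggtree)
  tree <- ape::rtree(20)
  df <- data.frame(
    tip_name = tree$tip.label,
    meta1 = paste0(tree$tip.label, "_meta1"),
    meta2 = paste0(tree$tip.label, "_long_long_meta2")
  )
  p <- add_text_columns(ggtree(tree) %<+% df + geom_tiplab(),
                        cols = c(Meta1 = "meta1", Meta2 = "meta2"),
                        rel_width = c(1, 2))
  p <- p + ggplot2::xlim(NA, 10)
}
```

Keeping the offset arithmetic in its own function makes it easy to check the column positions without drawing anything, for example column_offsets(c(1, 2, 1)) for the three columns used above.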
