There are several ways to visualize the output of PhyloT and assign branch lengths to it. However, the tree has to be exported with specific settings, or the import will fail.
Parse the tree using the R package ape, for further use with ggtree
Choose example #1 in PhyloT. Set Internal nodes to collapsed and Polytomy to No, then export as Newick. This gives:
((((Gallus_gallus,(Homo_sapiens,Mus_musculus)Euarchontoglires)Amniota,Drosophila_melanogaster)Bilateria,Escherichia_coli)cellular_organisms);
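which read.tree cannot parse: the redundant outer pair of parentheses wraps the root in a node with only a single child (a singleton node). When you try to import this using read.tree, you will get: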
> read.tree('~/perl/test.nwk')
Error in read.tree("test.nwk") :
The tree has apparently singleton node(s): cannot read tree file.
Reading Newick file aborted at tree no. 1
To avoid the error, edit the Newick string and remove the outermost pair of parentheses, so that it looks like this:
(((Gallus_gallus,(Homo_sapiens,Mus_musculus)Euarchontoglires)Amniota,Drosophila_melanogaster)Bilateria,Escherichia_coli)cellular_organisms;
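If you would rather not edit the file by hand, the same fix can be scripted. The following is only a minimal sketch in base R: strip_singleton_root is a hypothetical helper name (it is not part of ape), and it assumes that labels contain no quoted parentheses or commas.

```r
# Hypothetical helper, not an ape function: remove one redundant outer pair
# of parentheses when that pair encloses a single node.
strip_singleton_root <- function(newick) {
  newick <- trimws(newick)
  # Only strings of the form "( ... );" are candidates
  if (!grepl("^\\(.*\\);$", newick)) return(newick)
  chars <- strsplit(newick, "")[[1]]
  # Nesting depth after each character
  depth <- cumsum((chars == "(") - (chars == ")"))
  # A comma at depth 1 means the outer pair holds several children,
  # so it is a genuine node and must be kept
  if (any(chars == "," & depth == 1)) return(newick)
  sub("^\\((.*)\\);$", "\\1;", newick)
}

raw <- "((((Gallus_gallus,(Homo_sapiens,Mus_musculus)Euarchontoglires)Amniota,Drosophila_melanogaster)Bilateria,Escherichia_coli)cellular_organisms);"
fixed <- strip_singleton_root(raw)
cat(fixed, "\n")
```

The result can be passed directly to read.tree(text = fixed), or written back to test.nwk with writeLines(fixed, "test.nwk").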
You can now read the tree into R and set the edge length in the following way:
library(ape) # provides read.tree() and write.tree()
library(ggtree)
tree <- read.tree('test.nwk')
ggtree(tree) + geom_tiplab(size=2) # plot tree without assigned lengths
tree$edge.length <- rpois(nrow(tree$edge), 10) # assign random edge lengths
ggtree(tree) + geom_tiplab(size=2) # plot tree again
write.tree(tree) # output the modified tree
[1] "(((Gallus_gallus:9,(Homo_sapiens:16,Mus_musculus:10)Euarchontoglires:11)Amniota:8,Drosophila_melanogaster:8)Bilateria:10,Escherichia_coli:8)cellular_organisms;"
If you get the following error instead:
Error in if (sum(obj[[i]]$edge[, 1] == ROOT) == 1 && dim(obj[[i]]$edge)[1] > :
missing value where TRUE/FALSE needed
then you probably exported the tree with Internal nodes set to Expanded.
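The random edge lengths above are only placeholders. If you already have pre-determined values, one possible approach is to look them up by node label. The sketch below assumes that ape is installed, that every tip and internal node except the root carries a unique label, and that lengths_by_label (a hypothetical name) holds your own values. The numbers shown are made up for illustration and are not real divergence times.

```r
library(ape)

tree <- read.tree(text = "(((Gallus_gallus,(Homo_sapiens,Mus_musculus)Euarchontoglires)Amniota,Drosophila_melanogaster)Bilateria,Escherichia_coli)cellular_organisms;")

# Placeholder values only, NOT real data: one length per non-root node,
# named by the label of the node at the lower end of the branch
lengths_by_label <- c(
  Gallus_gallus = 3, Homo_sapiens = 1, Mus_musculus = 1,
  Euarchontoglires = 2, Amniota = 4, Drosophila_melanogaster = 7,
  Bilateria = 5, Escherichia_coli = 12
)

# ape numbers the tips first, then the internal nodes
all_labels <- c(tree$tip.label, tree$node.label)

# Column 2 of tree$edge is the child node of each edge; look up its label
child_labels <- all_labels[tree$edge[, 2]]
tree$edge.length <- unname(lengths_by_label[child_labels])

# Stop early if any label had no matching value
stopifnot(!anyNA(tree$edge.length))
write.tree(tree)
```

Note that these are branch lengths. Node ages (for example from TimeTree) would first have to be converted into parent-minus-child differences before being assigned this way.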
Using BioPerl's Bio::TreeIO module
You can add edge lengths with the following simple demo script:
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::TreeIO;
# read in a tree in newick format
my $treeio = Bio::TreeIO->new( -format => 'newick', -file => $ARGV[0] );
my $treeout = Bio::TreeIO->new( -format => 'newick' );
my $tree = $treeio->next_tree or die "No tree found in $ARGV[0]\n";
my $rootnode = $tree->get_root_node;
# collect every node below the root (all generations, not just the direct children)
my @nodes = $rootnode->get_all_Descendents();
$_->branch_length(1) for @nodes; # set a constant branch length on every node
# for a more realistic scenario, parse a file of branch lengths and match them on $node->id
$treeout->write_tree($tree);
print "\n";
Output:
$ ./tree_add_branchlength.pl test.nwk
(((Gallus_gallus:1,(Homo_sapiens:1,Mus_musculus:1)Euarchontoglires:1)Amniota:1,Drosophila_melanogaster:1)Bilateria:1,Escherichia_coli:1)cellular_organisms;
Maybe https://guangchuangyu.github.io/ggtree/ can help.
Thanks, this looks like a very useful package.
Does it have a function to add pre-determined divergence times to a tree file?
PhyloT-generated trees cannot be used with ggtree without adding branch lengths first, because they cannot be imported. Indeed, I must take back my statement about PhyloT being useful, because the generated trees cannot be opened in R, at least not with the ape package.
PhyloT is a nice tool for extracting trees from the taxonomy. You need to set branch lengths using some divergence timing information. You could try to get divergence time estimates from TimeTree, but while TimeTree might work well for higher level taxa, it might not have enough information on the species level. Maybe you could resort to making your own phylogeny like here: http://femsyr.oxfordjournals.org/content/15/6/fov050 ?
Thanks for your suggestion. Do you know if there is a way to add divergence times to the tree without having to edit the tree file (Newick format in my case) in a text editor?
I think I would parse the tree with a Perl script using Bio::TreeIO, then set each branch length using the Bio::Tree::NodeI method branch_length() and print the tree again, as in the example at http://search.cpan.org/~cjfields/BioPerl-1.6.924/Bio/Tree/NodeI.pm