FastTree trifurcating Root node
6.0 years ago
Moses ▴ 150

Hi,

I'm using FastTree to construct a phylogenetic tree for an application. The application requires a full binary tree; however, FastTree generates a tree that is binary at every node except the root. Is there an option I can change so that the root also has only two children?

This behaviour is stated in the documentation: "the root will have three children and other internal nodes will have two children"
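For anyone who wants to confirm this on their own output, here is a minimal sketch that counts the subtrees directly under the root of a Newick string. It uses only the standard library; the function name `root_children` and the example tree are made up for illustration, and it assumes plain Newick without quoted labels that contain commas or parentheses.

```python
def root_children(newick):
    """Return the subtrees directly under the root of a Newick string.

    Assumes unquoted labels (no commas or parentheses inside names).
    """
    text = newick.strip().rstrip(";").strip()
    end = text.rfind(")")  # closing parenthesis of the root clade
    if not text.startswith("(") or end == -1:
        return []  # a single leaf has no children
    inner = text[1:end]
    children, depth, start = [], 0, 0
    for i, ch in enumerate(inner):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            # A comma at depth 0 separates two children of the root.
            children.append(inner[start:i])
            start = i + 1
    children.append(inner[start:])
    return children


# Hypothetical tree with the trifurcating root described in the documentation.
tree = "(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);"
print(len(root_children(tree)))  # 3
```

A count of 3 means the root is trifurcating; a count of 2 means the tree is already bifurcating at the root.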

alignment FastTree Phylogenetic Tree • 4.1k views

Can you provide your data?

Polytomies are more often the result of certain aspects of the data than of the specific tree algorithm.

Have you attempted to plot your tree with anything else? (IQTree is very good).


You can access the multiple sequence alignment that I used to build the tree with FastTree through this link: https://iu.box.com/s/ihequ56at14e2f9twb4j5vu71m6zvpeu

I used the Archaeopteryx tool to visualize it.


You can always save the tree in a text-based format (phylip, nexus) and then open and manually edit it in a text editor. I used to have software for editing trees visually, but I can't recall what it was at the moment.


To my knowledge, and I asked a few colleagues too, there's no way to force the bifurcation. Whether Lieven's suggestion of manually editing it is appropriate I'm not sure - you'd have to know a priori which the outgroup would be and that's not always going to be the case.


Well, it might indeed not be 'biologically' appropriate but it might solve the issue. That being said, the outgroup remark is certainly valid as jrj.healey mentions. Not sure about FastTree but can't you add an outgroup species when constructing the tree? Or go for an un-rooted tree?


Thanks, all, for your responses and suggestions. Unfortunately, I will be constructing many trees by calling FastTree in a pipeline, so manually editing the Newick output would not be feasible. One approach I found that works is the "retree" program that comes with the PHYLIP package: its midpoint option places an arbitrary root at the midpoint of the tree, equidistant from the two furthest leaves. After this post-processing I end up with a full binary tree: if I have, say, n leaf nodes, then I have n-1 internal nodes (including the "root" node).

The only issue is that these programs were written to run interactively. If you want to call them from another script with a single command, you have to write a file containing all of the parameters, one per line, which makes the process more complicated. I need to find a more practical way to automate this post-processing!
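One possible way to avoid the parameter file, sketched here under assumptions: a menu-driven program that reads its answers from standard input can usually be fed those answers directly from the calling script. The example uses a small stand-in command rather than retree itself, and the answers `intree.nwk` and `Y` are placeholders, not retree's real menu keystrokes; those would need to be taken from an interactive session with the actual program.

```python
import subprocess
import sys


def run_with_answers(command, answers):
    """Run an interactive program, sending one answer per line on stdin."""
    stdin_text = "\n".join(answers) + "\n"
    result = subprocess.run(
        command,
        input=stdin_text,      # replaces the hand-written parameter file
        capture_output=True,
        text=True,
        check=True,            # raise if the program exits with an error
    )
    return result.stdout


# Stand-in for a menu-driven tool: it reads two lines and echoes them back.
# Swap in the real command and its real sequence of answers.
fake_tool = [
    sys.executable,
    "-c",
    "a = input(); b = input(); print('got', a, b)",
]
output = run_with_answers(fake_tool, ["intree.nwk", "Y"])
print(output.strip())  # got intree.nwk Y
```

Keeping the answers as a list in the pipeline script means they can be changed per tree without touching any files on disk.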


Look into DendroPy or ete3. Both are Python modules that have methods for midpoint rooting a tree.

Are you using FastTree for any reason other than speed?

How did you make your initial alignment by the way? I looked at it earlier and it seems to be very 'gappy'.


This would make life much easier. Thank you for this!


Update! I tried the DendroPy method. It gave the same results as the FastTree program; however, it takes much longer to run!


Yes, I wasn’t suggesting you make your tree with it. I was just saying to use it to apply a midpoint root to the existing tree.

In terms of speed, you probably won’t beat FastTree (hence the name). You might want to revisit your alignment process, though, as a bad alignment can slow down tree construction, and I’m not convinced your alignment is optimal.


What tool do you suggest for alignments? I used MUSCLE to generate the file I shared the link to!

Also, I have noticed another problem: FastTree is not writing the full sequence names from the FASTA headers. For example, if the FASTA header in my multiple sequence alignment is 'D08_k99_412321_1342_1701_+', it becomes 'D08_k99_412321_1342' in the Newick output file,

or if the FASTA header is 'D13_k99_152816_1343_1711_+', it becomes 'D13_k99_152816_1343' in the Newick output file, and so on. I don't understand why FastTree is not keeping the entire sequence header, even though there are no restricted characters and no spaces in my FASTA IDs, just underscores, which the documentation encourages in place of space characters.


Many phylogenetic tools have a length limit on the sequence names they allow (don't ask me why). Just make sure that the names stay unique.
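A quick way to check uniqueness before running a tool that shortens names is to truncate the headers yourself and look for clashes. This is a minimal sketch: the function name is made up, the limit of 19 is simply the length of the shortened names quoted above (not a documented limit of any tool), and the third header is invented to show what a clash looks like.

```python
from collections import defaultdict


def truncation_collisions(names, limit):
    """Group names that become identical when cut to `limit` characters."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:limit]].append(name)
    # Keep only the shortened names shared by more than one original name.
    return {short: full for short, full in groups.items() if len(full) > 1}


headers = [
    "D08_k99_412321_1342_1701_+",
    "D13_k99_152816_1343_1711_+",
    "D08_k99_412321_1342_1999_-",  # hypothetical header, added to force a clash
]
clashes = truncation_collisions(headers, 19)
print(clashes)
```

An empty result means the shortened names are still unique at that length; otherwise each key lists the original headers that would become indistinguishable in the tree.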


As Lieven says, many phylo tools impose limits. I'm not sure what the limit is in the case of FastTree, but for example, PHYLIP strict format has a hard limit of 10 characters.


Yes, the PHYLIP tools are constraining the names. I will have to use a different tool, since I need those sequence headers to appear in full in the Newick file.

As for my alignment, which tool should I use to align my sequences, since you mentioned above that my alignments don't seem to be optimal?


MUSCLE is a solid choice, so you can probably proceed as you are, but it would be more scientifically rigorous to compare it against the alignments you get from other tools. MAFFT is also pretty speedy, so maybe give that a try and see what you get.


Thank you for your directions. I'll also align it with MAFFT and see what I get.
