Using ML tree or bootstrapped tree
1
1
9.5 years ago
GR ▴ 400

Hi Everyone,

I am new to phylogenetics and am using RAxML to generate my phylogenetic trees. I ran 1000 bootstrap replicates to get support values on the best ML tree. I get two output files: one is the best ML tree and the other contains the bootstrap values. Which tree should I use as output? They both look pretty similar.

Thanks, RT

phylogenetics • 5.5k views
0

I am a new Biostar user. Although it is late, I am posting my answer in case other users benefit from it. I always use the ML tree with the bootstrap values, as those values are important for assessing clade support in phylogenetic analyses.

2
9.5 years ago
Brice Sarver ★ 3.8k

They will be identical, except the bootstrapped tree will have support values (generated through bootstrap resampling).

For systematic inference, a tree without support values is more-or-less meaningless, as you might have bipartitions that are not well-supported yet still present in your consensus tree.
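To check which bipartitions are poorly supported, you can pull the support values out of the Newick string. This is only a minimal sketch: it assumes supports are written as internal-node labels right after a closing parenthesis (e.g. `)95:0.05`), and both the 70% cutoff and the example tree are illustrative assumptions, not something RAxML prescribes.

```python
import re

def support_values(newick):
    """Return internal-node labels (assumed to be bootstrap supports) as floats."""
    # A support label directly follows a closing parenthesis, e.g. ")87:0.01".
    return [float(m) for m in re.findall(r"\)(\d+(?:\.\d+)?)", newick)]

def weak_nodes(newick, threshold=70.0):
    """Return the supports that fall below the (assumed) threshold."""
    return [s for s in support_values(newick) if s < threshold]

# Made-up example tree with two supported internal nodes.
tree = "((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.3)42:0.07,E:0.4);"
print(support_values(tree))  # [95.0, 42.0]
print(weak_nodes(tree))      # [42.0]
```

If your file stores supports as branch labels in square brackets instead, the regular expression would need to change accordingly.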

For comparative phylogenetic methods, you ought to 1) use the best-supported tree or 2) use a distribution of trees.

0

Hi Brice, what do you mean by a distribution of trees? For info, I need to use these trees for codeml analysis, and I guess it will be OK to go ahead with either (after removing both branch lengths and bootstrap values). I simply need the topology of the tree.
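For anyone wanting to do the same, stripping a tree down to its bare topology can be sketched roughly as below. It assumes a plain Newick string with unquoted taxon names, numeric branch lengths after a colon, and support values as internal-node labels; the function name and example tree are made up for illustration.

```python
import re

def topology_only(newick):
    """Remove branch lengths and internal-node labels from a Newick string."""
    # Drop branch lengths such as ":0.0123" or ":1e-05".
    no_lengths = re.sub(r":[0-9.eE+-]+", "", newick)
    # Drop internal-node labels (e.g. support values) that follow ")".
    return re.sub(r"\)[^,();]+", ")", no_lengths)

# Made-up example tree with branch lengths and bootstrap supports.
tree = "((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.3)42:0.07,E:0.4);"
print(topology_only(tree))  # ((A,B),(C,D),E);
```

Quoted labels or bracketed comments are not handled here, so check the result by eye before using it.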

2

Some diversification rate analyses can quantify uncertainty using a posterior distribution of trees - basically calculating parameters of interest across a bunch of trees to get 95% HPDs of the parameters. For the analyses in PAML, you can either 1) fix a tree, 2) specify a starting tree, or 3) let PAML figure it out on its own. In this case, you want to supply the best tree you can, either the ML tree or a tree generated using an additional approach.
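To make the "distribution of trees" idea concrete: once a parameter has been estimated on each sampled tree, the spread can be summarised with an HPD interval. The sketch below uses one common empirical definition (the narrowest window containing the requested fraction of the sorted samples); the function name and the numbers are hypothetical, and it assumes a non-empty list of estimates.

```python
import math

def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples the interval must contain
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

# Hypothetical per-tree estimates of some parameter (made-up numbers).
estimates = [0.8, 1.0, 1.1, 1.2, 1.2, 1.3, 1.4, 1.5, 1.6, 5.0]
print(hpd_interval(estimates, mass=0.8))  # (1.0, 1.6)
```

With real analyses you would use many more trees and the default 95% mass; the small example uses 80% only so the arithmetic is easy to follow.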

0

Hi Brice, thanks a lot for your help. Probably my last question, as you seem to know codeml well. Do I need to choose an outgroup from my trees for running codeml branch-site models and site-specific models? Also, some of my target genes are outside the tree. Will the results of the branch-site analysis be valid for these genes?

1
  1. Your selection analyses will have more power if your dataset has fixed differences among lineages as opposed to currently segregating polymorphisms. In this context, it might be worthwhile to include another lineage if you have the data. If your tree is rooted, with or without a true outgroup, codeml assumes that your tree follows a molecular clock. Remember that codeml is expecting branch lengths in units of substitutions per codon, not substitutions per site.
  2. Any branch analysis will only be accurate if the history of the other genes is the same as the ones you are testing. Phylogenetic discordance among genes is a real issue, so you ought to perform an analysis using the history of that gene. I can imagine situations where you might consider fixing the tree to the species tree, the representation of the true phylogenetic history of those lineages.