9.1 years ago
ramiro.barrantes
I am trying to make sense of two large phylogenies and am wondering how to estimate confidence in them. In the past I estimated confidence using the bootstrap, and posterior probabilities when doing Bayesian phylogenetic inference. I am now following up on that work, but the databases have grown immensely, to 15,000-30,000 sequences. What is the approach for getting some measure of confidence in a phylogeny of this size?
- Bootstrap or posterior probabilities on big phylogenies: is this possible or sensible? My concern is that with bootstrapping it takes just a single, very divergent sequence "jumping around" to invalidate the support values.
- Reducing the phylogeny to a set of sequences small enough to be manageable? If we are mostly interested in the deeper branches, does this make sense?
Thanks in advance, any help appreciated
Yes, I know. It's just that my experience has been that it's very "sensitive to outliers": it takes just one sequence that is divergent enough to change placement in every bootstrap replicate, and this adds noise. With 15,000 sequences, these "trouble sequences" can be hard to find. Do you know the situation I am talking about? It happened to me years ago with a single viral sequence among all the other eukaryotic/bacterial sequences. When it was present, the bootstrap values on the deeper branches were very low, but this was an artifact of the viral sequence being placed differently in the different bootstrap alignments because it was so different.
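To make the effect concrete, here is a toy sketch in Python of what I mean. It is not taken from any phylogenetics package: the four rooted "bootstrap trees", the taxon names A-D and V, and the helper functions `clades` and `support` are all made up for illustration. The clade (A,B) is really present in every replicate, but the wandering sequence V breaks it up in half of them; counting support again after pruning V from each replicate recovers it.

```python
# Toy bootstrap replicates as nested tuples (rooted trees). Hypothetical data:
# A, B, C, D are stable taxa; V is the divergent sequence that keeps moving.
replicates = [
    ((("A", "B"), "V"), ("C", "D")),
    (("A", ("B", "V")), ("C", "D")),
    (("A", "B"), (("C", "V"), "D")),
    ((("A", "V"), "B"), ("C", "D")),
]

def clades(tree, drop=frozenset()):
    """Return the leaf set under every internal node, ignoring leaves in `drop`."""
    found = set()

    def walk(node):
        if isinstance(node, str):  # a leaf
            return frozenset() if node in drop else frozenset([node])
        under = frozenset().union(*(walk(child) for child in node))
        if len(under) > 1:  # a clade needs at least two remaining leaves
            found.add(under)
        return under

    walk(tree)
    return found

def support(trees, clade, drop=frozenset()):
    """Fraction of trees that contain `clade` once the leaves in `drop` are pruned."""
    drop = frozenset(drop)
    target = frozenset(clade) - drop
    hits = sum(target in clades(tree, drop) for tree in trees)
    return hits / len(trees)

with_rogue = support(replicates, {"A", "B"})
without_rogue = support(replicates, {"A", "B"}, drop={"V"})
print(f"(A,B) support with V: {with_rogue:.2f}, with V pruned: {without_rogue:.2f}")
```

On this toy input the support for (A,B) goes from 0.50 to 1.00 once V is pruned. With 15,000 sequences the hard part is knowing which sequence to prune; trying each one in turn like this would be far too slow, so this only illustrates the artifact and is not a way to find the trouble sequences.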