I am currently using the excellent R packages 'phangorn' and 'ape' to do some parsimony-based phylogenetic analysis with the 'pratchet' function (parsimony ratchet), and have run into a problem with nodes that I feel should probably be collapsed into polytomies.
At present, the bootstrapped trees I am producing contain several nodes with bootstrap support of zero or close to zero (that is, the trees contain forced bifurcations). I am calculating the mean bootstrap score for each tree (I am using numerous variants of the alignment to see which produces the best-supported tree on average), and I have dealt with the zero-support nodes by dividing the sum of the node support scores by the number of non-zero nodes. The problem is that nodes with negligible support (i.e. less than 10) are still being counted whereas zero-support nodes are not, so the mean node bootstrap support jumps way up for any tree that contains nodes with zero support.
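To make the effect concrete, here is a small arithmetic sketch. The support values are made up for illustration and do not come from my trees; the cut-off of 10 is the "negligible" level mentioned above.

```r
# Hypothetical bootstrap support values for the internal nodes of one tree
support <- c(0, 5, 90, 95)

# Mean over every node, including the zero-support one
mean_all <- mean(support)

# Current workaround: sum of all scores divided by the number of non-zero nodes
mean_nonzero <- sum(support) / sum(support > 0)

# Mean over only the nodes that would survive collapsing at a cut-off of 10
mean_above_10 <- mean(support[support >= 10])

print(c(all = mean_all, nonzero = mean_nonzero, above_10 = mean_above_10))
```

The three numbers differ considerably, which is why the choice of denominator matters when comparing trees.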
A simpler solution would be to collapse nodes with bootstrap scores below a given value into polytomies. This way I could calculate the mean node support in a straightforward manner without worrying about zero nodes or nodes with unacceptably low support.
Using the 'phylo' class objects and functions that are present in 'phangorn' and 'ape', could anyone advise on how to collapse nodes that have support scores below a user-set value?
Alternatively, is there a way I can ask the 'pratchet' function to produce trees that do not force bifurcations onto poorly (or zero) supported nodes?
Thank you for your help.
Thank you for this, Brice. I will give di2multi() a try. After spending some time reading over the code, what I have is a parsimony ratchet tree that is then tested for support by bootstrapping, rather than a majority-rule consensus built from the bootstrap trees. This is potentially why some of the node support scores are so very low (I would not expect such low scores with a majority-rule consensus). On a side note, I will take a look at what can be done about producing a consensus tree from these data and how it might alter my results.
I have just tried using di2multi() to collapse a zero node in a test data set but it is failing with the following error:
On examining the phylo object I get the following:
I'll have to generate edge lengths.
I solved this using acctran(tree, data), which added edge lengths. I can now use di2multi() to collapse edges based on a tolerance setting.
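For anyone following along, this is a minimal sketch of how the tolerance works. The Newick string and the tolerance value are made up for illustration, and the tree is assumed to already carry edge lengths (in my case they came from acctran()).

```r
library(ape)

# Toy tree with edge lengths; the internal edge leading to (A,B) is very short
tree <- read.tree(text = "((A:1,B:1):0.001,(C:1,D:1):2,E:1);")

# Collapse every internal edge shorter than the tolerance into a polytomy
collapsed <- di2multi(tree, tol = 0.01)

print(Nnode(tree))       # internal nodes before collapsing
print(Nnode(collapsed))  # internal nodes after collapsing
```

Only internal edges are affected; short terminal edges leading to tips are left alone.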
Is there another way that this can be done without needing to generate edge lengths? In other words, rather than creating polytomies based on edge lengths shorter than a tolerable value, is it possible to collapse nodes that simply have a poor bootstrap support regardless of edge length?
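One possible workaround, offered as a sketch rather than an established method: give di2multi() dummy edge lengths that encode which nodes are weak, then discard them afterwards. This assumes the support values are stored in tree$node.label (one per internal node, in node order). The helper name collapse_low_support and the toy tree are hypothetical.

```r
library(ape)

# Hypothetical helper: collapse internal nodes whose support is below `threshold`.
# Assumes support values live in tree$node.label. Any existing edge lengths
# are dropped from the returned tree.
collapse_low_support <- function(tree, threshold) {
  support <- suppressWarnings(as.numeric(tree$node.label))
  # Internal nodes are numbered Ntip + 1, Ntip + 2, ... in node.label order
  weak_nodes <- which(!is.na(support) & support < threshold) + Ntip(tree)
  # Edges whose child is a weak node (the root is never a child, so it is kept)
  weak_edges <- which(tree$edge[, 2] %in% weak_nodes)

  tmp <- tree
  tmp$edge.length <- rep(1, nrow(tmp$edge))  # dummy length for edges to keep
  tmp$edge.length[weak_edges] <- 0           # dummy length for edges to collapse
  out <- di2multi(tmp, tol = 0.5)
  out$edge.length <- NULL                    # remove the dummy lengths again
  out
}

# Toy tree with support values as node labels (made up for illustration)
tree <- read.tree(text = "(((A,B)5,C)80,(D,E)95)100;")
collapsed <- collapse_low_support(tree, threshold = 10)
print(collapsed$node.label)
```

The mean support can then be taken directly over as.numeric(collapsed$node.label), since the weak nodes no longer exist in the tree.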