Hi,
This is for members who might be knowledgeable about the Robinson-Foulds distance. I have a fixed tree T and a bunch of other trees B.
I use the Robinson-Foulds (RF) distance to compare each tree in B with T, which gives me a distribution of distances. I'd like to show that this distribution is significantly different from random. In other words, my null hypothesis is that the trees in B are no closer to T (in RF distance) than randomly chosen trees would be. To reject this hypothesis, it would help to know how the distances from T are distributed over all possible trees.
Hence the question: is it possible to know the number of trees at distance k from a fixed tree T?
By the way, while this solves the original problem, I don't think it has any practical application to phylogenetics, because:
Yes, thank you. I didn't go into the details, but ignoring branch lengths and possible trees should be sufficient for the analysis at hand. I also forgot to mention that all the trees have the same number of nodes. The real question, which I should have made more precise, was "Can we find the number of trees at distance k analytically, purely by mathematical methods and without resorting to generating all non-isomorphic trees?". But after searching and asking around, I'm becoming fairly convinced that this hasn't been solved and is likely to be a tough problem. So I'll probably do as you suggest. Thank you for your answer.
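In the absence of a closed-form count, one way to approximate the null distribution is to sample random trees and tabulate their RF distances to T. The sketch below is only an illustration under several assumptions that are not from the thread: trees are rooted, bifurcating and labelled 0..n-1, they are stored as nested tuples, and RF is computed on clades (clusters) rather than on bipartitions. All function names are made up for this example.

```python
import random
from collections import Counter

def size(node):
    """Number of nodes (leaves and internal) in a nested-tuple tree."""
    return 1 if isinstance(node, int) else 1 + size(node[0]) + size(node[1])

def attach(node, leaf, k):
    """Attach `leaf` on the edge above the k-th node (pre-order) of `node`."""
    if k == 0:
        return (node, leaf)
    left, right = node
    k -= 1
    left_size = size(left)
    if k < left_size:
        return (attach(left, leaf, k), right)
    return (left, attach(right, leaf, k - left_size))

def random_rooted_tree(n, rng):
    """Random rooted bifurcating tree on leaves 0..n-1.

    Leaves are added one at a time on a uniformly chosen edge (the root
    edge included), which gives every labelled topology the same probability.
    """
    tree = 0
    for leaf in range(1, n):
        tree = attach(tree, leaf, rng.randrange(size(tree)))
    return tree

def clusters(node, out):
    """Add the leaf set of every internal node to `out`; return this node's leaf set."""
    if isinstance(node, int):
        return frozenset([node])
    members = clusters(node[0], out) | clusters(node[1], out)
    out.add(members)
    return members

def rf_distance(t1, t2):
    """Rooted RF distance: number of clades found in exactly one of the two trees."""
    c1, c2 = set(), set()
    clusters(t1, c1)
    clusters(t2, c2)
    return len(c1 ^ c2)

def null_distribution(reference, n, samples, seed=0):
    """Counter mapping RF distance to the number of sampled random trees at that distance."""
    rng = random.Random(seed)
    return Counter(
        rf_distance(reference, random_rooted_tree(n, rng)) for _ in range(samples)
    )
```

The observed distances between the trees in B and T could then be compared with `null_distribution(T, n, samples)`. For unrooted trees the same idea applies, with bipartitions in place of clades.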
While the number of tree topologies is finite, enumerating them becomes intractable for even modest numbers of taxa. Felsenstein (1978; http://tinyurl.com/zdnsn48) gives the equation for the number of labelled bifurcating rooted trees for n taxa:
(2n-3)! / [2^(n-2) * (n-2)!]
The number of unrooted trees for x taxa can be obtained from the equation above by setting n = x - 1.
For example, for just 10 taxa the number of unrooted trees is 2,027,025 and the number of rooted trees is 34,459,425.
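The formula above is easy to check numerically. Here is a minimal Python sketch; the function names are just illustrative, and it uses exact integer arithmetic so the counts stay precise for larger n.

```python
from math import factorial

def rooted_trees(n):
    """Number of labelled bifurcating rooted trees for n taxa (n >= 2)."""
    if n < 2:
        raise ValueError("need at least 2 taxa")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))

def unrooted_trees(x):
    """Number of labelled bifurcating unrooted trees for x taxa (x >= 3)."""
    if x < 3:
        raise ValueError("need at least 3 taxa")
    return rooted_trees(x - 1)

print(unrooted_trees(10))  # 2027025
print(rooted_trees(10))    # 34459425
```

The counts grow so quickly with n that listing every tree stops being realistic well before typical data set sizes.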