Suppose you start out with a distance matrix. With PHYLIP, for example, you can use neighbor, kitsch and fitch to turn such a matrix into a tree. But how would you go about getting a bootstrapped tree from a distance matrix?
You must subsample your alignments to get many distance matrices. These matrices can then be used to build a consensus tree and add bootstrap supports.
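As a rough illustration of the "add bootstrap supports" step, here is a minimal Python sketch. It assumes the replicate trees have already been built (for instance, one per resampled distance matrix) and that each tree is rooted and written as nested tuples of leaf names. The function names `clades` and `bootstrap_support` are made up for this example and do not come from PHYLIP or any other package. For unrooted trees such as neighbor-joining output, you would compare bipartitions rather than clades.

```python
def clades(tree):
    """Collect every clade of a rooted nested-tuple tree as a frozenset of leaf names."""
    found = set()

    def leaves(node):
        # A leaf is a plain string; an internal node is a tuple of children.
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def bootstrap_support(reference, replicates):
    """Fraction of replicate trees that contain each clade of the reference tree."""
    replicate_clades = [clades(tree) for tree in replicates]
    return {
        clade: sum(clade in rc for rc in replicate_clades) / len(replicates)
        for clade in clades(reference)
    }


# Toy data (hypothetical): a reference tree and four bootstrap replicate trees.
reference = (("A", "B"), ("C", "D"))
replicates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
]
support = bootstrap_support(reference, replicates)
for clade, value in sorted(support.items(), key=lambda item: sorted(item[0])):
    print(sorted(clade), value)
```

In this toy case the clade {A, B} appears in three of the four replicates, so its support is 0.75. In practice a dedicated consensus program does this bookkeeping for you.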
I don't have much experience with trees, but from googling I see there is the pvclust package, which takes a (distance) matrix as input and performs a bootstrap on the cluster analysis:
pvclust is an R package for assessing the uncertainty in hierarchical cluster analysis. For each cluster in hierarchical clustering, quantities called p-values are calculated via multiscale bootstrap resampling. P-value of a cluster is a value between 0 and 1, which indicates how strong the cluster is supported by data.
Could it help?
I actually used it for this very task. I was hoping to uncover more ways, preferably using more 'traditional' packages, to build more trees for comparison. Also, based on my observations so far, PHYLIP's neighbor with NJ gave a 'better' tree than pvclust with Ward clustering and Euclidean distances. It could be that I haven't nailed the 'best' settings for pvclust, though. Another problem with pvclust is that I don't know how to get a NEXUS tree out of it so that I could edit it further in FigTree.
But my distance matrix is not based on alignments, well not directly anyway.
If you can subset whatever data the distance matrix is based on, you can bootstrap.
This Q has information on bootstrapping from non-alignment data. In this case it uses an R function (boot.phylo) to wrap up the bootstrap process. But really, all you need to do is resample with replacement, column-wise, on whatever data you used to make your distance matrix.
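To make the column-wise resampling concrete, here is a minimal Python sketch. It assumes the underlying data is a table with one row per taxon (e.g. proteome) and one column per feature, and that Euclidean distance is the measure in use. The table values and the function names `euclidean_matrix` and `bootstrap_matrices` are invented for illustration. Whether columns are the right unit to resample depends on how your data were generated.

```python
import math
import random


def euclidean_matrix(table):
    """Pairwise Euclidean distances between the rows of a table."""
    n = len(table)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(table[i], table[j])))
            dist[i][j] = dist[j][i] = d
    return dist


def bootstrap_matrices(table, n_replicates, seed=None):
    """Resample columns with replacement and return one distance matrix per replicate."""
    rng = random.Random(seed)
    n_cols = len(table[0])
    replicates = []
    for _ in range(n_replicates):
        # Draw as many column indices as there are columns, with replacement.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        resampled = [[row[c] for c in cols] for row in table]
        replicates.append(euclidean_matrix(resampled))
    return replicates


# Toy frequency table (hypothetical values): 3 taxa x 4 features.
freq_table = [
    [10, 0, 3, 5],
    [9, 1, 4, 5],
    [0, 8, 7, 1],
]
matrices = bootstrap_matrices(freq_table, n_replicates=100, seed=1)
print(len(matrices), "replicate distance matrices")
```

Each replicate matrix can then go through the same tree-building step as the original matrix, and the resulting trees can be summarised into a consensus tree with support values.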
I have assigned the lowest common ancestor to every protein of some 1k proteomes. From this I created a frequency table and transformed that into a Euclidean distance matrix. I'm not quite sure how I would go about subsetting this matrix. Any hints?
Also at David W, did you forget to enter a link?
I did forget the link:
Bootstrapping => consensus tree construction based on distance matrices
I'm not sure what you mean by "lowest common ancestor" for a single protein, or what you are aiming at with the analyses, so I can't provide much more help.
So, I have ca. 1k proteomes belonging to three families, and I've assigned a last (wrote lowest above for some reason) common ancestor to every protein. So in a proteome you can have e.g. proteins with their LCA being the LCA of all three families (rank would be order of..), whereas in other cases the LCA can be the LCA of the specific family, or some other taxonomic rank. Anyway, the link seems very informative. Thanks.