Bootstrapped tree from a distance matrix
10.2 years ago
5heikki 11k

Suppose you start out with a distance matrix. With e.g. PHYLIP, you can use neighbor, kitsch and fitch to turn such a matrix into a tree. However, how would you go about getting a bootstrapped tree from a distance matrix?

phylogenomics • 7.0k views
10.2 years ago

You must resample your alignments (with replacement) to get many distance matrices. Each of these matrices can then be used to build a tree, and the resulting trees can be combined into a consensus tree with bootstrap support values.
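To make the last step concrete, here is a minimal sketch of how support values could be counted once the replicate trees exist. It assumes rooted trees written as nested tuples of leaf names; the names `clades` and `bootstrap_support` and the toy trees are made up for illustration. For unrooted trees you would count bipartitions rather than clades.

```python
def clades(tree):
    """Return every clade in a nested-tuple tree as a frozenset of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def bootstrap_support(reference, replicates):
    """Fraction of replicate trees that contain each clade of the reference tree."""
    replicate_clades = [clades(tree) for tree in replicates]
    return {
        clade: sum(clade in rc for rc in replicate_clades) / len(replicates)
        for clade in clades(reference)
    }


# Hypothetical toy data: one reference tree and four bootstrap replicate trees.
reference = (("A", "B"), ("C", "D"))
replicates = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ((("A", "B"), "C"), "D"),
]
support = bootstrap_support(reference, replicates)
```

In this toy case the clade {A, B} appears in three of the four replicates, so its support is 0.75.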


But my distance matrix is not based on alignments, well not directly anyway.


If you can resample whatever data the distance matrix is based on, you can bootstrap.


This question has information on bootstrapping from non-alignment data. In this case it uses an R function (boot.phylo) to wrap up the bootstrap process. But really, all you need to do is resample with replacement, column-wise, on whatever data you used to make your distance matrix.
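As a rough illustration of column-wise resampling, the sketch below draws columns of a feature table with replacement and recomputes a Euclidean distance matrix for each replicate. It assumes rows are taxa (or proteomes) and columns are the features the distances were computed from; the function names and the toy table are hypothetical, and Euclidean distance stands in for whatever measure you actually used.

```python
import math
import random


def euclidean_matrix(table):
    """Pairwise Euclidean distances between the rows of `table`."""
    n = len(table)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = math.dist(table[i], table[j])
    return dist


def bootstrap_matrices(table, n_replicates, seed=0):
    """Resample columns with replacement and build one distance matrix per replicate."""
    rng = random.Random(seed)
    n_cols = len(table[0])
    matrices = []
    for _ in range(n_replicates):
        # Draw as many column indices as there are columns, with replacement.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        resampled = [[row[c] for c in cols] for row in table]
        matrices.append(euclidean_matrix(resampled))
    return matrices


# Hypothetical toy table: 3 taxa (rows) by 4 features (columns).
table = [
    [5, 0, 2, 1],
    [4, 1, 2, 0],
    [0, 6, 1, 3],
]
matrices = bootstrap_matrices(table, n_replicates=100)
```

Each replicate matrix can then go through the same tree-building step as the original matrix (for example PHYLIP's neighbor), and the replicate trees can be summarized with a consensus program.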


I have assigned the lowest common ancestor to every protein of some 1k proteomes. From this I have created a frequency table and transformed that into a Euclidean distance matrix. I'm not quite sure how I would go about subsetting this matrix. Any hints?

Also, @David W, did you forget to enter a link?


I did forget the link:

Bootstrapping => consensus tree construction based on distance matrices

I'm not sure what you mean by "lowest common ancestor" for a single protein, or what you are aiming at with the analyses, so I can't provide much more help.


So, I have ca. 1k proteomes belonging to three families, and I've assigned a last (wrote lowest above for some reason) common ancestor to every protein. So in a proteome you can have e.g. proteins with their LCA being the LCA of all three families (rank would be order of..), whereas in other cases the LCA can be the LCA of the specific family, or some other taxonomic rank. Anyway, the link seems very informative. Thanks.

10.2 years ago

I don't have much experience with trees, but by googling I see there is the pvclust package, which takes a data matrix as input (it computes the distances itself) and performs bootstrapping on the cluster analysis:

pvclust is an R package for assessing the uncertainty in hierarchical cluster analysis. For each cluster in hierarchical clustering, quantities called p-values are calculated via multiscale bootstrap resampling. P-value of a cluster is a value between 0 and 1, which indicates how strong the cluster is supported by data.

Could it help?


I actually used it for this very task. I was hoping to uncover more ways, ideally using more 'traditional' packages, to build more trees for comparison. Also, based on my observations so far, PHYLIP's neighbor with NJ resulted in a 'better' tree than pvclust with Ward clustering and Euclidean distances. It could be that I haven't nailed the 'best' settings for pvclust though. Another problem with pvclust is that I don't know how to get a Nexus tree out of it so that I could edit it further in FigTree.
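On the Nexus point: once a tree topology is available in any form, writing a Nexus file by hand is not much work. The sketch below assumes the tree has already been extracted as a nested tuple of leaf names (that extraction step is not shown and depends on the tool), ignores branch lengths and support values, and assumes leaf names contain no spaces or punctuation that would need quoting. The function names are made up for illustration.

```python
def to_newick(node):
    """Serialize a nested-tuple tree to a Newick string without branch lengths."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


def to_nexus(tree, name="tree1"):
    """Wrap the Newick string in a minimal NEXUS trees block."""
    return "#NEXUS\nBEGIN TREES;\n  TREE {} = {};\nEND;\n".format(
        name, to_newick(tree)
    )


# Hypothetical four-taxon tree.
tree = (("A", "B"), ("C", "D"))
nexus_text = to_nexus(tree)
```

Writing `nexus_text` to a file should give something a tree viewer can open; branch lengths would need to be added to the Newick string if they matter for the figure.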


If you are trying to do a serious "phylogenetic reconstruction", you should try MrBayes or GARLI.
