4.9 years ago
Chvatil
Hi, I need some help getting a phylogenetic distance matrix.
Here are two examples:
Example one:
import ete3
tree = ete3.Tree('(((A,B),C),D);')
print(tree)
         /-A
      /-|
   /-|   \-B
  |  |
--|   \-C
  |
   \-D
The matrix should then be:
  A B C D
A 0 1 2 3
B 1 0 2 3
C 2 2 0 3
D 3 3 3 0
As you can see, A and B are the closest leaves, C is closer to A and B than it is to D, and D is the furthest leaf.
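One possible reading of the matrix above (an assumption, since the post does not define the distance formally) is that each entry is the height of the most recent common ancestor of the two leaves, counted in edges down to its deepest leaf. A minimal sketch of that idea in plain Python follows. It deliberately avoids ete3 so it runs on its own; `parse_newick` and `mrca_height_matrix` are hypothetical helper names, not library functions, and the parser only handles name-only Newick strings without branch lengths.

```python
from itertools import combinations


def parse_newick(newick):
    """Parse a name-only Newick string into nested tuples of leaf names."""
    text = newick.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # skip "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            pos += 1  # skip ")"
            return tuple(children)
        # Otherwise read a leaf name up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos]

    return parse_node()


def mrca_height_matrix(tree):
    """Return (leaves, dist), where dist[(a, b)] is the height of the MRCA of a and b."""
    dist = {}

    def visit(node):
        # Returns (leaves under this node, height of this node in edges).
        if isinstance(node, str):
            dist[(node, node)] = 0
            return [node], 0
        results = [visit(child) for child in node]
        height = 1 + max(h for _, h in results)
        # Leaves that sit under different children meet at this node.
        for (left, _), (right, _) in combinations(results, 2):
            for a in left:
                for b in right:
                    dist[(a, b)] = dist[(b, a)] = height
        return [leaf for leaves, _ in results for leaf in leaves], height

    leaves, _ = visit(tree)
    return leaves, dist


leaves, dist = mrca_height_matrix(parse_newick("(((A,B),C),D);"))
print("  " + " ".join(leaves))
for a in leaves:
    print(a, " ".join(str(dist[(a, b)]) for b in leaves))
```

Under that assumed definition, the printed rows for this tree are the same as the matrix given above.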
Here is a second, more complex example:
tree=ete3.Tree('((((((A,B),C),D),(E,F)),G),(H,I));')
print(tree)
                  /-A
               /-|
            /-|   \-B
           |  |
         /-|   \-C
        |  |
      /-|   \-D
     |  |
     |  |   /-E
   /-|   \-|
  |  |      \-F
  |  |
--|   \-G
  |
  |   /-H
   \-|
      \-I
and here I should get the following matrix:
  A B C D E F G H I
A 0 1 2 3 4 4 5 6 6
B 1 0 2 3 4 4 5 6 6
C 2 2 0 3 4 4 5 6 6
D 3 3 3 0 4 4 5 6 6
E 4 4 4 4 0 1 5 6 6
F 4 4 4 4 1 0 5 6 6
G 5 5 5 5 5 5 0 6 6
H 6 6 6 6 6 6 6 0 1
I 6 6 6 6 6 6 6 1 0
I tried the get_distance function in ete3, but it does not give a matrix based on node distance...
Thank you, but
pdm.to_csv('/path/to/output.csv')
gives AttributeError: 'PhylogeneticDistanceMatrix' object has no attribute 'to_csv'
Anyway, I tried, but I only get distances of zero between leaves
...
Ah sorry, I think it's .write_csv(). Check the package documentation.
You are getting zero distances because your tree is topological only: it has no branch lengths. You can artificially 'fudge' this by making a cladogram of your tree and setting all the distances to 1. Effectively, your nodes have no distance in the normal sense for a tree, just hierarchical relationships.
I'm not aware of any built-in functionality to calculate this based just on the 'rank'/'cardinality' of the nodes. It would be doable in principle by calculating the pairwise node ranks etc., but that's far more work than just faking a cladogram and using the built-in methods.
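To make the "all distances set to 1" idea concrete, here is a rough sketch in plain Python of what a distance matrix then contains: the number of edges on the path between two leaves. It does not use the phylogenetics package at all; the tree is written by hand as nested tuples, and `unit_length_distances` is a hypothetical helper name, so treat it as an illustration of the idea rather than the library's method.

```python
from itertools import combinations


def unit_length_distances(tree):
    """Pairwise leaf distances when every branch has length 1.

    `tree` is a nested tuple of leaf names, e.g. ((("A", "B"), "C"), "D").
    """
    dist = {}

    def visit(node):
        # Returns {leaf: number of edges from this node down to that leaf}.
        if isinstance(node, str):
            return {node: 0}
        per_child = []
        for child in node:
            depths = {leaf: d + 1 for leaf, d in visit(child).items()}
            per_child.append(depths)
        # Leaves under different children are joined by a path through this node.
        for left, right in combinations(per_child, 2):
            for a, da in left.items():
                for b, db in right.items():
                    dist[(a, b)] = dist[(b, a)] = da + db
        merged = {}
        for depths in per_child:
            merged.update(depths)
        return merged

    for leaf in visit(tree):
        dist[(leaf, leaf)] = 0
    return dist


example = ((("A", "B"), "C"), "D")  # same topology as '(((A,B),C),D);'
dist = unit_length_distances(example)
print(dist[("A", "B")], dist[("A", "C")], dist[("A", "D")])  # prints: 2 3 4
```

Note that these are path lengths in edges (A to B is 2, A to D is 4), so they need not coincide with the values in the matrix requested in the question, although the ordering of "closer" and "further" leaves from A is the same in this example.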