3.8 years ago
arunprasanna83
Hello,
I have a tree with a duplication, and I need to extract a monophyletic subtree to get orthologs. I have the following tree:
from ete3 import PhyloTree
t2 = PhyloTree("((RED_18455.t1:0.00147625,(YEL_2874.t1:0.0138986,YEL_23839.t1:0.0506878)n2:0.00563198)n1:0.000294125,BLU_7991.t1:0.00203445)n0;", format=1)
print(t2)
      /-RED_18455.t1
   /-|
  |  |   /-YEL_2874.t1
--|   \-|
  |      \-YEL_23839.t1
  |
   \-BLU_7991.t1
R = t2.get_midpoint_outgroup()
t2.set_outgroup(R)
print(t2)
   /-YEL_23839.t1
--|
  |   /-YEL_2874.t1
   \-|
     |   /-RED_18455.t1
      \-|
         \-BLU_7991.t1
t2.set_species_naming_function(lambda node: node.name.split("_")[0])
print(t2.get_species())
{'RED', 'BLU', 'YEL'}
for node in t2.split_by_dups():
    print(node)

--YEL_23839.t1

   /-YEL_2874.t1
--|
  |   /-RED_18455.t1
   \-|
      \-BLU_7991.t1
As you can see, there are two subtrees. How can I pick the second subtree, which has all three species represented exactly once, or otherwise extract a single-copy orthogroup from this tree?
Thanks.
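To make the selection criterion concrete, here is a minimal sketch in plain Python of what "all three species represented once" could mean as a filter. It works only on lists of leaf names, so it does not depend on ete3; the helper names `species_of` and `is_single_copy` are hypothetical, and the two partitions are hard-coded from the output printed above. With ete3, the leaf names of each partition could presumably be obtained with `node.get_leaf_names()`, but that wiring is an assumption and is not shown here.

```python
from collections import Counter


def species_of(leaf_name):
    # Same naming rule as in the post: the species code is the text before the first "_".
    return leaf_name.split("_")[0]


def is_single_copy(leaf_names, expected_species):
    # True when every expected species occurs exactly once and no other species is present.
    counts = Counter(species_of(name) for name in leaf_names)
    return set(counts) == set(expected_species) and all(c == 1 for c in counts.values())


# Leaf names of the two partitions shown in the post (hard-coded for illustration).
partitions = [
    ["YEL_23839.t1"],
    ["YEL_2874.t1", "RED_18455.t1", "BLU_7991.t1"],
]
expected = {"RED", "BLU", "YEL"}

# Keep only the partitions that form a single-copy orthogroup.
single_copy = [p for p in partitions if is_single_copy(p, expected)]
print(single_copy)
```

The filter only checks species counts; whether the retained partition is the biologically appropriate orthogroup still depends on the rooting and on how the duplication was inferred.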