Bio.Phylo cut the tree (python)
16 months ago
Fedor ▴ 10

I have a tree and I want to cut it at a given depth (the red line in the picture) to produce separate subtrees. Basically, what I need is something like a dict with group IDs (1-6 in the picture) as keys and lists of sequences as values.

The red line splits the tree into 6 separate subtrees.

Thank you for your kind assistance.

phylogeny cuttree biopython tree Bio.Phylo • 860 views

Maybe you could add a code chunk so that people have a better idea of how to advise? I know that you can use find_clades to iterate over the clades matching your target, and the depths method might then be what you need to 'cut' your tree.

I'm no expert; I just figured it should be possible, so I had a look at the docs for the object. Maybe there is an easier way.
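
To make the suggestion above concrete, here is a minimal sketch of what depths and find_clades return. The Newick string is a made-up toy tree, not the poster's data, and the rounding is only there to keep the floating-point sums readable.

```python
from io import StringIO

from Bio import Phylo

# Hypothetical toy tree used only for illustration.
newick = "((A:0.1,B:0.2):0.3,(C:0.4,D:0.1):0.05,E:0.6);"
tree = Phylo.read(StringIO(newick), "newick")

# depths() maps every clade (internal and terminal) to its distance from the root.
depths = tree.depths()
leaf_depths = {leaf.name: round(depths[leaf], 3) for leaf in tree.get_terminals()}

# find_clades() is an iterator; terminal=False restricts it to internal nodes,
# visited in preorder starting at the root.
internal_depths = [round(depths[c], 3) for c in tree.find_clades(terminal=False)]

print(leaf_depths)
print(internal_depths)
```

Comparing these depths against a threshold is what tells you on which side of the cut a clade starts.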


OK, I've figured out how to do what I need. Thanks for the advice; I used both find_clades and depths.

from Bio import Phylo

tree = Phylo.read('tree.nwk', 'newick')

depth_threshold = 0.008
# Map of clade -> distance from the root, filled in preorder
# (every ancestor is visited before its descendants).
depths = tree.depths()
groups = {}
in_group = set()  # clades already covered by a subtree past the cut
cnt = 0
for clade, depth in depths.items():
    if clade in in_group:
        # already assigned through an ancestor
        continue
    if clade.is_terminal():
        # a sequence not covered by any cut subtree forms a group alone
        cnt += 1
        groups[str(cnt)] = [clade]
    elif depth >= depth_threshold:
        # first internal node past the cut: its sequences form one group
        cnt += 1
        groups[str(cnt)] = clade.get_terminals()
        in_group.update(clade.find_clades())

print('Number of clades: ', len(groups))
print('clade\tn_seq')
for i in groups:
    print(i, '\t', len(groups[i]))
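
For reuse, the same idea can be wrapped in a function that returns the dict asked for in the question (group ID to sequence names). This is only a sketch under a few assumptions: the name cut_tree and the toy Newick string are hypothetical, group IDs are plain integers assigned left to right, and a leaf that ends before the cut is treated as a group of its own, as in the loop above.

```python
from io import StringIO

from Bio import Phylo


def cut_tree(tree, depth_threshold):
    """Return {group id: [leaf names]} for the subtrees produced by the cut.

    Hypothetical helper, not part of Bio.Phylo.
    """
    depths = tree.depths()
    groups = {}
    stack = [tree.root]
    while stack:
        clade = stack.pop()
        if clade.is_terminal() or depths[clade] >= depth_threshold:
            # This clade starts past the cut (or is a lone leaf): one group.
            groups[len(groups) + 1] = [leaf.name for leaf in clade.get_terminals()]
        else:
            # Still above the cut: descend. reversed() keeps left-to-right order.
            stack.extend(reversed(clade.clades))
    return groups


# Toy tree for illustration only.
toy = Phylo.read(StringIO("((A:0.1,B:0.2):0.3,(C:0.4,D:0.1):0.05,E:0.6);"), "newick")
groups = cut_tree(toy, 0.2)
print(groups)
```

Descending only while a clade is still above the threshold avoids tracking which clades were already assigned, because a subtree is never entered once it has been emitted as a group.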