Construct Newick Tree from tab-delimited or csv file of phylogeny, preferably in Python
10.2 years ago
weslfield ▴ 90

So I am looking for a way, preferably a Pythonic way (packages are ok), to convert a tab-delimited or csv of hierarchical phylogenies into the classic Newick format for tree visualization.

Each line of the file has ['Phylum','Class','Order','Family','Genus','Species','Subspecies','gi'] as values, and I would like to create a Newick tree representation. Any help is greatly appreciated. Thanks!

newick phylogeny python tree • 9.8k views

Hello weslfield!

It appears that your post has been cross-posted to another site: http://stackoverflow.com/questions/26146623

This is typically not recommended as it runs the risk of annoying people in both communities.

10.1 years ago
Asaf 10k

You can use dendropy.

The easiest way I can think of (without thinking too much) is, for each line, to walk from the phylum down to the gi, creating a child node at each level or selecting the node you already created. Then you can export the tree in Newick format.
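The walk described above can be sketched in plain Python. This is a minimal illustration, not dendropy code: the `Node` class, `rows_to_newick`, and the three-rank sample rows are all hypothetical stand-ins, and it assumes labels contain no characters that are special in Newick (commas, parentheses, colons, semicolons, spaces).

```python
import csv
import io


class Node:
    """Hypothetical tree node: a label plus children keyed by label."""

    def __init__(self, label=""):
        self.label = label
        self.children = {}  # dicts keep insertion order in Python 3.7+

    def child(self, label):
        """Return the existing child with this label, creating it if needed."""
        if label not in self.children:
            self.children[label] = Node(label)
        return self.children[label]

    def to_newick(self):
        """Serialize this subtree: children in parentheses, then the label."""
        if not self.children:
            return self.label
        inner = ",".join(c.to_newick() for c in self.children.values())
        return "(" + inner + ")" + self.label


def rows_to_newick(rows):
    root = Node()  # unlabeled root that joins all top-level groups
    for row in rows:
        node = root
        for label in row:  # walk from the highest rank down to the leaf
            node = node.child(label)
    return root.to_newick() + ";"


# Illustrative input only: three ranks instead of the eight in the question.
sample = io.StringIO(
    "Chordata,Mammalia,Primates\n"
    "Chordata,Mammalia,Rodentia\n"
    "Chordata,Aves,Passeriformes\n"
)
newick = rows_to_newick(csv.reader(sample))
print(newick)
# (((Primates,Rodentia)Mammalia,(Passeriformes)Aves)Chordata);
```

Keying children by label is what makes "select a node you already created" a simple dictionary lookup, so rows that share a prefix end up on the same branch.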


Thanks Asaf, the code below in my answer addressed the problem using the node approach.

Many thanks from the Technion :)


Regards to Ruti

10.1 years ago
weslfield ▴ 90
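import csv
from collections import defaultdict
from pprint import pprint

# A tree is a dict whose missing keys are created on access as empty subtrees.
def tree(): return defaultdict(tree)

# Walk down the path, creating each node that does not exist yet.
def tree_add(t, path):
    for node in path:
        t = t[node]

# Debug helper: print the tree as plain nested dicts.
def pprint_tree(tree_instance):
    def dicts(t): return {k: dicts(t[k]) for k in t}
    pprint(dicts(tree_instance))

# Each CSV row is one root-to-leaf path (Phylum, ..., gi).
def csv_to_tree(input_file):
    t = tree()
    for row in csv.reader(input_file, quotechar='\''):
        tree_add(t, row)
    return t

# Serialize recursively: children in parentheses, followed by the node label.
def tree_to_newick(root):
    items = []
    for k in root:  # plain iteration works on Python 2 and 3; iterkeys() is Python 2 only
        s = ''
        if len(root[k]) > 0:
            sub_tree = tree_to_newick(root[k])
            if sub_tree != '':
                s += '(' + sub_tree + ')'
        s += k
        items.append(s)
    return ','.join(items)

# Returns the Newick body without branch lengths. Append ';' (and wrap multiple
# top-level groups in parentheses) to get a complete Newick string.
def csv_to_weightless_newick(input_file):
    t = csv_to_tree(input_file)
    #pprint_tree(t)
    return tree_to_newick(t)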
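
The question also mentions tab-delimited input, which `csv.reader` only handles when given a `delimiter`. A minimal sketch of that variant follows, with the nested-dict logic restated compactly so the block runs on its own. `tsv_to_newick`, `to_newick`, and the sample rows are hypothetical, and the sketch assumes labels need no Newick quoting. It wraps the result in an unlabeled root and adds the terminating semicolon.

```python
import csv
import io
from collections import defaultdict


def tree():
    return defaultdict(tree)


def to_newick(node):
    """Comma-join the subtrees of a nested dict, children before labels."""
    parts = []
    for label, children in node.items():
        sub = to_newick(children)
        parts.append("(" + sub + ")" + label if sub else label)
    return ",".join(parts)


def tsv_to_newick(handle):
    root = tree()
    for row in csv.reader(handle, delimiter="\t"):  # tab-delimited input
        t = root
        for label in row:
            t = t[label]  # defaultdict creates missing nodes on access
    # Unlabeled root plus ';' so the output is a complete Newick string.
    return "(" + to_newick(root) + ");"


# Illustrative rows only, truncated to three ranks.
sample = io.StringIO(
    "Proteobacteria\tGammaproteobacteria\tEnterobacterales\n"
    "Proteobacteria\tAlphaproteobacteria\tRhizobiales\n"
)
result = tsv_to_newick(sample)
print(result)
# (((Enterobacterales)Gammaproteobacteria,(Rhizobiales)Alphaproteobacteria)Proteobacteria);
```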