Good evening @all,
I need help with passing lists of species to the ETE3 Toolkit.
I have about 1,000 CDS files for which I need to build phylogenetic trees. Each file contains a varying number of species, with a maximum of 26 species in some files. I only need the trees as input for downstream analysis, so I used TimeTree to construct a species tree from the 26 species (saved as speciesName.nwk). I have extracted the species IDs from each FASTA file and stored them as text files. Now I want to construct a tree for each text file by pruning speciesName.nwk down to the species listed in that text file, and save each tree under the name of the text file.
This is where I am stuck:
#!/usr/bin/env python3
from pathlib import Path

from ete3 import Tree

p = Path('.')
# Load the 26-species tree built with TimeTree
t = Tree("speciesName.nwk", format=3)
print(t)

for files in p.glob("*.txt"):
    with open(files, "r") as myfile:
        # One species ID per line
        data = myfile.read().splitlines()
    t.prune(data)
At the t.prune(data) call, I get the following error:
Traceback (most recent call last):
  File "etetest.py", line 16, in <module>
    t.prune(data)
  File "/home/vagrant/.local/lib/python3.6/site-packages/ete3/coretype/tree.py", line 531, in prune
    to_keep = set(_translate_nodes(self, *nodes))
  File "/home/vagrant/.local/lib/python3.6/site-packages/ete3/coretype/tree.py", line 2601, in _translate_nodes
    raise ValueError("Node names not found: "+str(notfound))
ValueError: Node names not found: ['Myotis_brandtii', 'Myotis_lucifugus']
I would appreciate any assistance. I am using Python for the first time because of this ETE3 software.
Thank you All.
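
A possible explanation, offered as a guess rather than a confirmed diagnosis: prune() modifies the tree in place, so after the first text file is processed, species missing from that file are gone from t, and a later file that lists them (here the two Myotis species) can trigger "Node names not found". The same error would also appear if a name in a text file is spelled differently from the leaf name in speciesName.nwk. The sketch below reloads the full tree for every text file and reports any names it cannot find. The function names split_known and prune_all and the ".pruned.nwk" output suffix are made up for illustration; it also assumes one species ID per line in each text file.

```python
from pathlib import Path


def split_known(wanted, leaf_names):
    """Split species IDs into (present in the tree, missing from the tree)."""
    leaf_names = set(leaf_names)
    # Drop blank lines and stray whitespace around each ID
    wanted = [w.strip() for w in wanted if w.strip()]
    keep = [w for w in wanted if w in leaf_names]
    missing = [w for w in wanted if w not in leaf_names]
    return keep, missing


def prune_all(tree_file="speciesName.nwk", folder="."):
    """Write one pruned tree per *.txt file found in folder."""
    # Imported here so split_known() can be used without ete3 installed
    from ete3 import Tree

    for txt in sorted(Path(folder).glob("*.txt")):
        # Reload the full species tree each time, because prune()
        # changes the tree object it is called on.
        t = Tree(tree_file, format=3)
        wanted = txt.read_text().splitlines()
        keep, missing = split_known(wanted, t.get_leaf_names())
        if missing:
            print(f"{txt.name}: not in species tree: {missing}")
        if len(keep) < 2:
            # Nothing meaningful to write with fewer than two species
            continue
        t.prune(keep)
        # Hypothetical naming scheme: genes1.txt -> genes1.pruned.nwk
        out = txt.with_name(txt.stem + ".pruned.nwk")
        t.write(outfile=str(out), format=3)
```

Calling prune_all() from the folder that holds speciesName.nwk and the text files should then produce one Newick file per text file. If names are still reported as missing, comparing them against the output of t.get_leaf_names() may reveal a spelling or underscore/space mismatch.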