Extract Tree Topology From NCBI Taxonomy Database
3
1
12.2 years ago
Lhl ▴ 760

Hi There,

I am wondering whether I can extract a tree for a given list of species from the NCBI Taxonomy database using BioPerl.

I hope some of you can help me figure this out.

Many thanks in advance.

Kind Regards,

Lhl

bioperl taxonomy • 6.0k views
ADD COMMENT
2

Not a BioPerl solution, but this Python script will do the job: https://github.com/jhcepas/ncbi_taxonomy . I wrote some more info at http://jhcepas.cgenomics.org/?p=216

ADD REPLY
0

Not exactly, but I think it is useful. I might use it in the future. Thanks a lot.

Lhl

ADD REPLY
0

Yes. This is exactly what I want.

Many thanks.

Lhl

ADD REPLY
0

I'm trying this method as well.

from ete3 import NCBITaxa
ncbi = NCBITaxa()

tree = ncbi.get_topology([9606, 9598, 10090, 7707, 8782])
print(tree.get_ascii(attributes=["sci_name", "rank"]))

Can someone tell me how I can convert the output to a Newick file?

ADD REPLY
2
5.6 years ago
from ete3 import NCBITaxa

ncbi = NCBITaxa()

tree = ncbi.get_topology([9606, 9598, 10090, 7707, 9782])
print(tree.write(format=9, features=["sci_name", "rank"]))
ADD COMMENT
0

Thanks! This works as well. Is there a way to replace the TaxIDs with the scientific names?

ADD REPLY
1
5.6 years ago
AK ★ 2.2k
from ete3 import NCBITaxa
ncbi = NCBITaxa()

tree = ncbi.get_topology([9606, 9598, 10090, 7707, 8782])
tree.write(features=["sci_name", "rank"], outfile="tree.nw")

gives you an output tree.nw:

(7707:1[&&NHX:sci_name=Dendrochirotida:rank=order],(((9606:1[&&NHX:sci_name=Homo sapiens:rank=species],9598:1[&&NHX:sci_name=Pan troglodytes:rank=species])1:1[&&NHX:sci_name=Homininae:rank=subfamily],10090:1[&&NHX:sci_name=Mus musculus:rank=species])1:1[&&NHX:sci_name=Euarchontoglires:rank=superorder],8782:1[&&NHX:sci_name=Aves:rank=class])1:1[&&NHX:sci_name=Amniota:rank=no rank]);
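
If all you have is the saved tree.nw file, the scientific names are already in the NHX comments, so they can be moved into the label position with plain string processing. The following is a minimal sketch using only the standard library; `relabel_nhx` is a hypothetical helper name (not part of ete3), and it assumes every label is a numeric TaxID or support value and that feature values contain no ":" or "]" characters.

```python
import re

# A numeric label, an optional branch length, then an NHX comment.
# Assumption: labels are purely numeric, as in the output shown above.
NHX_NODE = re.compile(r"(\d+)(:[0-9.eE+-]+)?\[&&NHX:([^\]]*)\]")


def relabel_nhx(newick):
    """Replace each numeric label with its sci_name and drop the NHX comment."""

    def repl(match):
        label, dist, body = match.group(1), match.group(2) or "", match.group(3)
        # NHX features are "key=value" pairs separated by ":".
        feats = dict(kv.split("=", 1) for kv in body.split(":") if "=" in kv)
        # Keep the original label when no sci_name feature is present.
        return feats.get("sci_name", label) + dist

    return NHX_NODE.sub(repl, newick)


# Shortened sample taken from the output above.
sample = (
    "(7707:1[&&NHX:sci_name=Dendrochirotida:rank=order],"
    "(9606:1[&&NHX:sci_name=Homo sapiens:rank=species],"
    "9598:1[&&NHX:sci_name=Pan troglodytes:rank=species])"
    "1:1[&&NHX:sci_name=Homininae:rank=subfamily]);"
)
print(relabel_nhx(sample))
```

Note that this also turns the support value on internal nodes into the clade name (for example Homininae), and that names keep their spaces, which some strict Newick parsers may reject unless they are quoted or replaced with underscores.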
ADD COMMENT
0

Thanks for the fast response! Works like a charm :)

ADD REPLY
0

Is there a way to replace the TaxIDs with the scientific names?

ADD REPLY
0

A handy way would be:

ete3 ncbiquery --tree --search 9606 9598 10090 7707 8782

You'll get a tree in Newick format and scientific names as node names:

[Image: ete3_ncbiquery_tree]

ADD REPLY
0

You can do:

for node in tree.traverse():
    node.name = node.sci_name
ADD REPLY
0

Thanks SMK. I tried this but it does not work.

tree.write(features=["sci_name", "rank"], outfile="tree.nw")
t_file = Tree("tree.nw")

for node in t_file.traverse():
    node.name = node.sci_name

tree.write(features=["node.name"], outfile="labeled_tree.nw")

AttributeError: 'TreeNode' object has no attribute 'sci_name'

ADD REPLY
1

Hi peterlageweg603,

You have to traverse the one that you got from ncbi.get_topology():

from ete3 import NCBITaxa
ncbi = NCBITaxa()

tree = ncbi.get_topology([9606, 9598, 10090, 7707, 8782])

for node in tree.traverse():
    node.name = node.sci_name

print(tree.write())

This gives you:

(Dendrochirotida:1,(((Homo sapiens:1,Pan troglodytes:1)1:1,Mus musculus:1)1:1,Aves:1)1:1);
ADD REPLY
0

Ah, thank you for the fast response and the help! This is exactly what I needed :)

ADD REPLY
0
12.2 years ago
briano • 0

Lhl,

The BioPerl Bio::Tree::TreeFunctionsI module has a get_lca method. Is this what you're looking for?

Brian O.
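
For readers unfamiliar with the idea, a lowest common ancestor can be found by comparing root-to-taxon lineages level by level. The sketch below is plain Python, not the BioPerl implementation; `lowest_common_ancestor` is a hypothetical name and the lineages are abbreviated, illustrative lists built from clade names that appear elsewhere in this thread.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest name shared by all root-to-taxon lineages, or None."""
    lca = None
    # zip() walks the lineages in parallel, one taxonomic level at a time,
    # and stops at the end of the shortest lineage.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break  # the lineages diverge here
        lca = level[0]
    return lca


# Hypothetical, abbreviated lineages for illustration only.
lineages = [
    ["root", "Amniota", "Euarchontoglires", "Homininae", "Homo sapiens"],
    ["root", "Amniota", "Euarchontoglires", "Homininae", "Pan troglodytes"],
    ["root", "Amniota", "Euarchontoglires", "Mus musculus"],
]
print(lowest_common_ancestor(lineages))  # Euarchontoglires
```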

ADD COMMENT
0

Hi Briano,

I want to get a species tree for a given list of species. For example, given a list containing species A, species B, and species C, I want to know how to get the tree for these species. Thank you anyway. I think I might need Bio::Tree::Tree and get_lca in the future.

Cheers

Lhl
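
Conceptually, getting a tree for a list of species means merging their lineages and dropping the ranks that do not branch. The sketch below shows that idea in plain Python; it is not how BioPerl or ete3 implement it, the function names are hypothetical, and the lineages are abbreviated examples rather than real NCBI records.

```python
def lineages_to_newick(lineages):
    """Merge root-to-taxon lineages into a Newick string (topology only)."""
    # Build a nested dict: each key is a taxon name, each value its children.
    trie = {}
    for lineage in lineages:
        node = trie
        for name in lineage:
            node = node.setdefault(name, {})
    return _to_newick("", trie) + ";"


def _to_newick(name, children):
    # Skip over nodes with a single child, since they add no branching.
    # Limitation: a requested taxon that is the sole ancestor of another
    # requested taxon is collapsed as well.
    while len(children) == 1:
        name, children = next(iter(children.items()))
    if not children:
        return name
    inner = ",".join(_to_newick(n, c) for n, c in children.items())
    return f"({inner}){name}"


# Hypothetical, abbreviated lineages for illustration only.
lineages = [
    ["root", "Dendrochirotida"],
    ["root", "Amniota", "Euarchontoglires", "Homininae", "Homo sapiens"],
    ["root", "Amniota", "Euarchontoglires", "Homininae", "Pan troglodytes"],
    ["root", "Amniota", "Euarchontoglires", "Mus musculus"],
    ["root", "Amniota", "Aves"],
]
print(lineages_to_newick(lineages))
```

In practice the lineages would come from the taxonomy database itself, which is what the ete3 answers in this thread take care of.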

ADD REPLY
