Comparative Genomics - Make phylogenetic tree of species with available NCBI genomes
8.0 years ago
dacotahm ▴ 20

Hello,

I'm doing some comparative genomics and NCBI's taxonomy browser is useful for screening species with genomes available. I want to export this in Newick or some standard format so I can visualize it easily in R or ITOL.

There isn't a mechanism to export a list though...

Is there a quick way to pull the species or taxon IDs that have published genomes from NCBI and turn them into an exportable tree? I don't do this often enough to know many of the resources.

Specifically, I want a tree of all the species in Nematocera which have genomes.

Link to the specific NCBI page

Thanks for any help!

phylogenetics genome ncbi comparative • 3.8k views

One would need to pick a gene that is present in all those genomes and build a tree after doing an MSA, correct? Or are you only looking to get a tree-like representation of the information on the page you linked?


The second. Let's say I want to build a tree of all Dipterans for which NCBI has a genome available. The linked page lists the species filtered by genome availability and shows the taxonomic lineage of each one. I would like to find a way to access that information and export it as Newick or some other format I can use to build visuals.

This website can access NCBI phylogenetic information, but I would like to find out how to keep only the species in a clade that have genomes.
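To make the goal concrete, here is a minimal Python sketch of the last step only: turning taxonomic lineages into a Newick string that R or iTOL can read. It assumes the lineages (root to species) for the genome-bearing species have already been collected by some other means; the function name `lineages_to_newick` and the shortened example lineages are illustrative, not taken from NCBI output.

```python
def lineages_to_newick(lineages):
    """Convert root-to-leaf lineages into a Newick string.

    Each lineage is a list of taxon names ordered from the shared root
    down to the species. Internal nodes keep their taxon names as labels.
    """
    # Build a nested dict: taxon name -> dict of child taxa.
    tree = {}
    for lineage in lineages:
        node = tree
        for name in lineage:
            node = node.setdefault(name, {})

    def label(name):
        # Unquoted Newick labels cannot contain spaces, so use underscores.
        # Names with parentheses, commas, or colons would need extra handling.
        return name.replace(" ", "_")

    def render(name, children):
        if not children:
            return label(name)
        inner = ",".join(render(child, sub) for child, sub in sorted(children.items()))
        return f"({inner}){label(name)}"

    roots = sorted(tree.items())
    if len(roots) != 1:
        raise ValueError("all lineages must start from the same root taxon")
    return render(*roots[0]) + ";"


# Illustrative, heavily shortened lineages (assumption: real ones have more ranks).
example_lineages = [
    ["Nematocera", "Culicidae", "Aedes aegypti"],
    ["Nematocera", "Culicidae", "Anopheles gambiae"],
    ["Nematocera", "Psychodidae", "Phlebotomus papatasi"],
]
print(lineages_to_newick(example_lineages))
```

The result is a taxonomy-based cladogram without branch lengths, so it reflects NCBI's classification rather than an inferred phylogeny.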


Hi. I have a problem with Trimmomatic:

Exception in thread "main" java.lang.NumberFormatException: For input string: "10‬‬"
    at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.base/java.lang.Integer.parseInt(Integer.java:652)
    at java.base/java.lang.Integer.parseInt(Integer.java:770)
    at org.usadellab.trimmomatic.trim.IlluminaClippingTrimmer.makeIlluminaClippingTrimmer(IlluminaClippingTrimmer.java:56)
    at org.usadellab.trimmomatic.trim.TrimmerFactory.makeTrimmer(TrimmerFactory.java:32)
    at org.usadellab.trimmomatic.Trimmomatic.createTrimmers(Trimmomatic.java:59)
    at org.usadellab.trimmomatic.TrimmomaticSE.run(TrimmomaticSE.java:303)
    at org.usadellab.trimmomatic.Trimmomatic.main(Trimmomatic.java:85)


How do you think this relates to the original question?

5.2 years ago
Mensur Dlakic ★ 28k

To download genomes:

https://github.com/kblin/ncbi-genome-download

To build a concatenated tree from single-copy genes:

https://github.com/yuwwu/ezTree

ezTree is meant for prokaryotes, but it is not difficult to modify it so it works with eukaryotic genomes. Not sure if you want to deal with that, though.

You can always use BUSCO to identify single-copy genes present in all or most of your species of interest, then concatenate the alignments and build a tree manually.
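The concatenation step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of BUSCO or ezTree: each gene alignment is assumed to be already loaded as a dict mapping species name to aligned sequence, and the function name `concatenate_alignments` is hypothetical. Species missing from a gene are padded with gaps so every row of the supermatrix has the same length.

```python
def concatenate_alignments(alignments, gap="-"):
    """Join per-gene alignments into one supermatrix.

    alignments: list of dicts, one per gene, mapping species -> aligned sequence.
    Returns a dict mapping species -> concatenated sequence.
    """
    # Collect every species seen in at least one gene alignment.
    species = sorted({sp for aln in alignments for sp in aln})
    parts = {sp: [] for sp in species}

    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for sp in species:
            # Pad with gap characters when a species lacks this gene.
            parts[sp].append(aln.get(sp, gap * width))

    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy example with made-up sequences.
gene1 = {"species_A": "MK-L", "species_B": "MKAL"}
gene2 = {"species_A": "GG", "species_C": "GA"}
for name, seq in concatenate_alignments([gene1, gene2]).items():
    print(f">{name}\n{seq}")
```

The concatenated FASTA output can then be passed to whichever tree-building program you prefer.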

5.2 years ago
Juke34 8.9k

You can use ncbi_get_genome_tree.pl from the GAAS toolkit. It creates a tree in Newick (nh) format.

For example, for all available mammalian genomes (40674 is the taxid for Mammalia):

ncbi_get_genome_tree.pl -t 40674
4.1 years ago

Check out the excellent GToTree software: https://github.com/AstrobioMike/GToTree
