I have a list of NCBI taxonomy IDs with species names. I want to derive a species-to-species distance matrix from them. Does anyone know how to do that?
Hi,
I partially found a solution to the question I posted here. Before generating the distance matrix, I need to generate a phylogenetic tree for the given NCBI taxonomy IDs. Given the IDs, the tree can be generated using phyloT. phyloT was free until recently; unfortunately, tree building with phyloT is no longer free. Once we have the tree, load it into R using the read.tree command from library(treeio). The loaded tree can then be converted to a distance matrix using the cophenetic command (the generic lives in library(stats); the method for phylogenetic trees comes from the ape package).
P.S. I say "partially" because phyloT tree generation is not free. If anyone knows of a phyloT alternative, please mention it. That would save me some money :P
Below is an alternative to phyloT (to save some money ;) for generating a phylogenetic tree from NCBI taxonomy IDs. Go to the NCBI Taxonomy Browser. From the Taxonomy Tools section, select Common Tree. Upload the list of NCBI taxonomy IDs and download the tree via the option save as --> phylip tree. The downloaded tree is in multi-line Newick format, so one additional step is needed to load it into R: collapse the lines into a single string.
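treeText <- readLines("tree.phy") ## path to the tree file saved from Common Tree
treeText <- paste0(treeText, collapse="") ## join the multi-line Newick into one string
library(treeio)
tree <- read.tree(text = treeText) ## load tree
distMat <- cophenetic(tree) ## generate dist matrix (tip-to-tip path lengths)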
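To check what the resulting matrix looks like without downloading anything, here is a minimal sketch on a toy tree. The species labels and branch lengths are made up for illustration, and it loads ape directly (the package that read.tree originates from) instead of treeio; the multi-line vector only imitates the layout of a downloaded file.

```r
# Toy stand-in for the downloaded multi-line Newick file.
# Labels and branch lengths are invented for this illustration.
library(ape)  # provides read.tree() and the cophenetic() method for trees

toyText <- c("(",
             "(Homo_sapiens:1,Pan_troglodytes:1):2,",
             "Mus_musculus:3",
             ");")
toyText <- paste0(toyText, collapse = "")  # same collapse step as above
toyTree <- read.tree(text = toyText)

# Pairwise tip-to-tip distances: sum of branch lengths on the connecting path
toyDist <- cophenetic(toyTree)
print(toyDist)
```

In this toy tree the two primates come out 2 apart and each is 6 from the mouse, so the matrix is simply the path length between tips. The distances are therefore only as meaningful as the branch lengths in the tree you load.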
The ete3 toolkit has a taxonomy database tool which allows you to do various queries with taxids and plugs into their phylogenetics tools. I can't tell you much more than that, but you might be able to start there. It can do some interesting things like return the minimal tree which spans all your taxids, so you might be able to traverse the tree to get what you need.
I think this is a good suggestion. Maybe you can have a look at link1, although I am not sure whether you can obtain species distances from ete3.
Hi, good to know about such a useful resource as ete3. I will update here once I solve the problem. Thanks.