Hi, I wish to build a phylogenetic tree for a thousand microbial genomes. The diversity should be high. Can anyone recommend steps or a pipeline for doing so? I was thinking about a 16S tree but am open to any other suggestions. Thanks
There is already a tree that includes 100,000+ bacterial species.
I am not sure exactly what you are trying to achieve, but for practical purposes it is impossible to inspect, or to publish, a tree that has more than a few hundred branches. It is also practically impossible to calculate such a tree without taking some shortcuts, which means using FastTree, hash-based trees as suggested, or other approximations.
My suggestion is to be sure you know what you are doing and what you are trying to achieve, as this is a non-trivial task that can take many months to complete and still not be appreciated by the general public or, more specifically, by reviewers. I think one can easily make a diverse bacterial tree with 150-200 entries, and it should be built from concatenated single-copy marker genes rather than 16S rRNA (see the exact procedure at that link above).
Thank you for the reply. The goal is to show the presence of a specific protein family across the diversity of genomes. I was thinking of using Mash-based methods, but I'm afraid that the diversity might be too high. So I thought of using 16S, but as you say, it involves several steps that I want to make sure I do correctly.
How would you show "the presence of a specific protein family across the diversity of genomes" by using 16S rRNA?
If I were a reviewer, you would convince me that a given protein family is widely distributed just as well by showing its presence in 100-200 well-chosen genomes as you would with 1000 genomes. The difference is that the smaller tree is easier to build and to actually inspect.
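One possible way to pick "well-chosen" genomes, if you already have a pairwise distance matrix (for example from Mash), is greedy farthest-point selection: start from one genome and repeatedly add the genome that is farthest from everything already chosen. This is only a sketch of one heuristic, not the procedure referred to above; the function name `pick_diverse` and its parameters are made up for illustration.

```python
def pick_diverse(dist, n_pick, start=0):
    """Greedy farthest-point (max-min) selection.

    dist   : square, symmetric list-of-lists of pairwise distances
    n_pick : number of genomes to keep
    start  : index of the first genome to include (an arbitrary choice)
    Returns the indices of the selected genomes, in the order picked.
    """
    n = len(dist)
    if n == 0 or n_pick <= 0:
        return []
    n_pick = min(n_pick, n)

    chosen = [start]
    chosen_set = {start}
    # min_dist[i] = distance from genome i to its nearest selected genome
    min_dist = list(dist[start])

    while len(chosen) < n_pick:
        candidates = [i for i in range(n) if i not in chosen_set]
        # Take the candidate that is farthest from the current selection.
        nxt = max(candidates, key=lambda i: min_dist[i])
        chosen.append(nxt)
        chosen_set.add(nxt)
        min_dist = [min(m, d) for m, d in zip(min_dist, dist[nxt])]

    return chosen
```

For 1000 genomes and `n_pick=150` this runs in well under a second in plain Python. The result depends on the starting genome, so it is worth checking that the groups you care about are still represented.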
FastTree: http://www.microbesonline.org/fasttree/
Create a tree using Mash distances: https://github.com/lskatz/mashtree
Sourmash: https://sourmash.readthedocs.io/en/latest/tutorial-basic.html#compare-many-signatures-and-build-a-tree
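To make the idea behind the hash-based tools above more concrete, here is a minimal sketch of a Mash-style distance. It assumes the published Mash formula D = -(1/k) ln(2j / (1 + j)), where j is the Jaccard index of the two k-mer sets. Unlike Mash itself, it uses exact k-mer sets rather than MinHash sketches and ignores reverse complements, so it is only suitable for toy input; the helper names are made up for illustration.

```python
import math


def kmers(seq, k):
    """Return the set of all k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mash_distance(j, k):
    """Convert a Jaccard index j into a Mash-style distance for k-mer size k."""
    if j <= 0.0:
        return 1.0  # no shared k-mers: report the maximum distance
    if j >= 1.0:
        return 0.0  # identical k-mer sets
    return min(1.0, -math.log(2.0 * j / (1.0 + j)) / k)


def distance_matrix(seqs, k):
    """Pairwise distances for a dict {name: sequence}; names are sorted."""
    names = sorted(seqs)
    sets = {name: kmers(seqs[name], k) for name in names}
    matrix = [
        [mash_distance(jaccard(sets[a], sets[b]), k) for b in names]
        for a in names
    ]
    return names, matrix
```

A matrix like this can be passed to any neighbor-joining implementation to get a tree. For real genomes, use the tools linked above, which work on sketches and scale to thousands of genomes.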