Is there a neat way to do this, i.e. to quickly get a core genome alignment from de novo assemblies and build a phylogenetic tree? I have hundreds of samples.
Hello Samuel!
Not sure about neat and quickly. It depends on the biology of your species.
With very small genomes (viruses, 10k) you may try to
If you are working with bacteria (~5 Mb) or eukaryotic genomes (1 Mb to 3 Gb), and you have different species, you may try to extract the 16S/18S rRNA sequences and build trees from them.
If you have individuals of the same species, or if you want more precise phylogeny, it is a longer story.
You have to annotate your genomes, i.e. mark where the genes are. How hard that is depends on the species; for non-model organisms it may become a whole enterprise.
When you have your genes annotated, you have to
You do not need to use all genes; you may use a smaller subset of 10-100 of the most conserved housekeeping genes. The set is different for Bacteria and Eukarya.
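As a rough sketch of how such a subset could be picked automatically: assuming you already have, for every genome, a list of gene family labels from your annotation or clustering step (the `annotations` dictionary and the gene names below are made up for illustration), the single-copy core set is simply the families that occur exactly once in every genome.

```python
from collections import Counter


def single_copy_core(annotations):
    """Return gene families present exactly once in every genome.

    `annotations` maps genome name -> list of gene family labels.
    This input format is hypothetical; adapt it to whatever your
    annotation or orthology clustering tool produces.
    """
    core = None
    for families in annotations.values():
        counts = Counter(families)
        # Families seen exactly once in this genome (no paralogs, not absent).
        singles = {fam for fam, n in counts.items() if n == 1}
        core = singles if core is None else core & singles
    return sorted(core or [])


# Toy data: tnpA is duplicated in sampleA, gyrA in sampleC, dnaK is only in sampleC.
annotations = {
    "sampleA": ["rpoB", "gyrA", "recA", "tnpA", "tnpA"],
    "sampleB": ["rpoB", "gyrA", "recA"],
    "sampleC": ["rpoB", "gyrA", "gyrA", "recA", "dnaK"],
}
core_genes = single_copy_core(annotations)
print(core_genes)
```

With hundreds of samples a strict "present in all" rule can leave very few genes, so in practice you may want to relax it (for example, present in most genomes); that threshold is a judgment call and not something this sketch decides for you.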
Good luck!
SN
This is a very useful comment! One addendum: OrthoFinder (proteome clustering software similar to OrthoMCL, but with more features, and it accounts for some weirdnesses in the alignments) also gives you a file of one-to-one clusters, which is an easier starting point for the concatenated (MUSCLE?) alignment.
Edit: OrthoFinder also gives you a species tree, but I have not really understood how it is made, except that it is based on a distance matrix that is somehow generated from the clusters. That phylogeny does not contain bootstrap values or other confidence measures, but it's a good start.
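For the concatenation step itself, here is a minimal sketch in plain Python. It assumes each one-to-one cluster has already been aligned separately (with MUSCLE or anything else) and that the FASTA headers have been renamed to the sample name, so the same sample has the same header in every gene alignment; the tiny inline sequences are invented for illustration. Samples missing from a gene are padded with gaps.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into {sample_name: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sample name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate(alignments):
    """Join per-gene alignments (list of dicts) into one supermatrix."""
    samples = sorted({s for aln in alignments for s in aln})
    matrix = {s: [] for s in samples}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment differ in length")
        width = lengths.pop()
        for s in samples:
            # Pad with gaps when a sample lacks this gene.
            matrix[s].append(aln.get(s, "-" * width))
    return {s: "".join(parts) for s, parts in matrix.items()}


gene1 = read_fasta(">A\nATG-\n>B\nATGC\n")
gene2 = read_fasta(">A\nGG\n>B\nGA\n>C\nGA\n")
supermatrix = concatenate([gene1, gene2])
for sample, seq in supermatrix.items():
    print(f">{sample}\n{seq}")
```

The resulting FASTA can then be given to whatever tree-building program you prefer; for real data you would read each alignment from its own file instead of the inline strings.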
Are you just looking for software that can create phylogenetic trees? If so, programs like SeaView work directly from an alignment.
Yeah, but SeaView cannot cope with hundreds of bacterial genomes; it crashes.