Hi,
Does anyone know of a whole genome alignment tool that can handle 200-300 bacterial genomes?
Thanks
For that many bacterial genomes, whole genome alignment would either be impractical or take a very long time and require a lot of computational resources, especially if you are working with a species that has high rates of genetic recombination/horizontal gene transfer, such as Streptococcus pneumoniae. For such species, most people work with a core genome alignment (a concatenated alignment of the genes shared by all genomes in the dataset) for phylogenetic analysis etc. This approach would not be appropriate if you are particularly interested in, for example, non-coding regions or rearrangements.
Tools that you could use include GET_HOMOLOGUES (http://www.ncbi.nlm.nih.gov/pubmed/24096415), Roary (https://github.com/sanger-pathogens/Roary), and CD-HIT (http://weizhong-lab.ucsd.edu/cd-hit/download.php).
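To make the core genome idea concrete, here is a rough Python sketch of the final concatenation step only. It assumes you already have per-gene alignments from some clustering and alignment pipeline; the function name, the dictionary layout, and the toy sequences are all made up for illustration and are not the interface of any of the tools above.

```python
def core_genome_alignment(gene_alignments, genomes):
    """Concatenate per-gene alignments for genes found in every genome.

    gene_alignments: dict mapping gene name -> {genome name: aligned sequence}
    genomes: list of genome names to include
    (Hypothetical data layout, chosen only for this illustration.)
    """
    # Core genes are those that have an aligned sequence in every genome.
    core_genes = sorted(
        gene for gene, aln in gene_alignments.items()
        if all(g in aln for g in genomes)
    )
    # Join the aligned sequences gene by gene, using the same gene order
    # for every genome so that alignment columns stay in register.
    concatenated = {
        g: "".join(gene_alignments[gene][g] for gene in core_genes)
        for g in genomes
    }
    return core_genes, concatenated


# Toy data: three genomes, three genes; geneB is missing from genome3,
# so it is excluded from the core.
toy = {
    "geneA": {"genome1": "ATG-CA", "genome2": "ATGGCA", "genome3": "ATG-CA"},
    "geneB": {"genome1": "TTGA", "genome2": "TTGA"},
    "geneC": {"genome1": "GGC", "genome2": "GAC", "genome3": "GGC"},
}
core, aln = core_genome_alignment(toy, ["genome1", "genome2", "genome3"])
print(core)
print(aln["genome2"])
```

In practice the pipelines handle ortholog clustering, per-gene alignment, and this merging step for you; the sketch only shows why genes missing from some genomes drop out of the result.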
Please have a look: http://mugsy.sourceforge.net/
+1 for Mauve. It is now located at http://darlinglab.org/mauve/mauve.html and you will need access to a machine with plenty of RAM to work with 200+ genomes.
Yes, it needs plenty of RAM, depending on your data set. But for such a huge data set, core genome alignment will always be better than whole genome alignment. Tools like HarvestTools could be a better choice.
Along with the other suggestions, you can also consider SyMAP, which uses MUMmer (NUCmer and PROmer) for genome alignment.