Question

Whole Genome Phylogeny

2

12.9 years ago

joe.loquasto ▴ 40

Hello,

I am interested in conducting a whole-genome phylogenetic analysis of closely related strains of bifidobacteria, with the hope of inferring a progenitor strain from the set. I am looking for a program that will select the genes that are conserved among all the strains and then construct a tree from them (if that is not the correct method, please let me know as well).

Thank you, Joe Loquasto

phylogeny • 4.2k views

updated 3.1 years ago by Ram 45k • written 12.9 years ago by joe.loquasto ▴ 40

score 2 · Answer 1 · 2012-06-06

Maybe you have already seen it, but there is a list of whole-genome alignment tools on the BioPerl wiki:

Whole genome alignment page at BioPerl Wiki

I am not sure that a whole-genome alignment is the best solution in this case. Prokaryotic genomes are usually sequenced with shotgun techniques, and the resulting assemblies contain many repetitive regions and missing regions. Genome alignment tools are usually designed to cope with these problems, not to detect conservation. So it may be better to extract the gene sequences, align each gene separately, and then rank the genes by dN/dS or another measure of conservation. This is just my opinion; I have never worked on this type of system personally.
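As a rough illustration of the "align each gene separately, then rank" idea, here is a minimal Python sketch. It assumes the per-gene alignments already exist (produced by whatever aligner you prefer) and uses mean pairwise identity as the conservation measure, which is a simpler stand-in for dN/dS and not the same thing. The function names and the toy sequences are made up for the example.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical positions, skipping columns where either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def mean_identity(alignment: Dict[str, str]) -> float:
    """Mean pairwise identity over all strain pairs in one gene alignment."""
    pairs = list(combinations(alignment.values(), 2))
    if not pairs:
        return 0.0
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


def rank_genes(alignments: Dict[str, Dict[str, str]]) -> List[Tuple[str, float]]:
    """Return (gene, score) pairs sorted from most to least conserved."""
    scores = {gene: mean_identity(aln) for gene, aln in alignments.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up alignments: gene -> {strain: aligned sequence}
toy_alignments = {
    "geneA": {"s1": "ATGGCC", "s2": "ATGGCC", "s3": "ATGGCA"},
    "geneB": {"s1": "ATGA-C", "s2": "TTGACC", "s3": "ATCACG"},
}
ranking = rank_genes(toy_alignments)
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

Identity ignores the synonymous/non-synonymous distinction, so for a selection-aware ranking you would still need a proper dN/dS estimate from a dedicated tool.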

score 1 · Answer 2 · 2012-06-08

1

12.9 years ago

joe.loquasto ▴ 40

CVTree was recommended to me. It seems easy enough to use. Does anyone have experience using this program?

Joe
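For readers unfamiliar with alignment-free methods: as I understand it, CVTree compares genomes through vectors of short-string (k-mer) frequencies rather than through an alignment. The sketch below shows only that general idea, using raw k-mer counts and a cosine distance. It is a simplified illustration and not CVTree's actual algorithm, and the function names, the choice of k, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt
from typing import Dict


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(c1: Counter, c2: Counter) -> float:
    """1 - cosine similarity between two k-mer count vectors."""
    dot = sum(c1[kmer] * c2[kmer] for kmer in c1.keys() & c2.keys())
    norm1 = sqrt(sum(v * v for v in c1.values()))
    norm2 = sqrt(sum(v * v for v in c2.values()))
    if norm1 == 0 or norm2 == 0:
        return 1.0
    return 1.0 - dot / (norm1 * norm2)


def distance_matrix(seqs: Dict[str, str], k: int) -> Dict[str, Dict[str, float]]:
    """All-against-all composition distances, keyed by strain name."""
    counts = {name: kmer_counts(seq, k) for name, seq in seqs.items()}
    return {
        a: {b: cosine_distance(counts[a], counts[b]) for b in counts}
        for a in counts
    }


# Toy, made-up sequences; real input would be whole genomes or proteomes.
toy_seqs = {
    "s1": "ATGGCCATGGCC",
    "s2": "ATGGCCATGGCA",
    "s3": "TTTTTTTTTTTT",
}
dist = distance_matrix(toy_seqs, k=3)
for name, row in dist.items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

A distance matrix like this can be passed to any distance-based tree method such as neighbor joining.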


Ram · Answer 3 · 2014-12-18

0

10.4 years ago

Chrispin Chaguza ▴ 280

It's a very old post, but I thought I would add to it to help others who want to do a similar analysis, i.e. create phylogenies from whole genomes of prokaryotic species. I have created a basic analysis pipeline that tries to simplify building species-level phylogenetic trees using only the conserved (otherwise known as core) genomic content shared by all the 'bacterial' species.

The steps used are described and the script is available at http://mcgp.sourceforge.net/
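To make the term "core" concrete, here is a small Python sketch of the definition only: the core is the set of gene families present in every strain. It is not taken from the pipeline linked above. It assumes orthologous gene families have already been assigned by some clustering step, and the strain names and family IDs are invented for the example.

```python
from typing import Dict, Set


def core_genes(presence: Dict[str, Set[str]]) -> Set[str]:
    """Gene families found in every strain."""
    if not presence:
        return set()
    return set.intersection(*presence.values())


def accessory_genes(presence: Dict[str, Set[str]]) -> Set[str]:
    """Gene families found in at least one strain but not in all of them."""
    if not presence:
        return set()
    return set.union(*presence.values()) - core_genes(presence)


# Hypothetical presence table: strain -> gene family IDs
toy_presence = {
    "strain1": {"dnaA", "gyrB", "recA", "lacZ"},
    "strain2": {"dnaA", "gyrB", "recA"},
    "strain3": {"dnaA", "gyrB", "tetW"},
}
core = core_genes(toy_presence)
accessory = accessory_genes(toy_presence)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
```

With draft assemblies a strict intersection can lose genes that are simply missing from one assembly, so some people relax the rule to "present in most strains".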

updated 5.6 years ago by Ram 45k • written 10.4 years ago by Chrispin Chaguza ▴ 280

0

I tried it, but it reported a bug, as usual with other tools.

10.3 years ago by HG ★ 1.2k

Ram · Answer 4 · 2014-12-18

0

10.4 years ago

5heikki 11k

Hal does it, although it's already quite old and possibly painful to get installed and working. I made a Bash script that uses HMMER to extract ribosomal proteins from proteomes and then constructs a concatenated alignment from them with Muscle and Gblocks. Then I use RAxML and PhyloBayes for tree building. You could do something similar?
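The concatenation step can be sketched in a few lines of Python. This is not the Bash script mentioned above, only an illustration of the idea. It assumes each marker protein has already been extracted, aligned, and trimmed, and it pads strains that lack a marker with gaps. The function name and the toy data are made up for the example.

```python
from typing import Dict, List, Tuple


def concatenate(
    alignments: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-gene alignments into one supermatrix.

    Returns the concatenated sequence per strain and a list of
    (gene, start, end) partitions using 1-based inclusive coordinates.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not all the same length")
        length = lengths.pop()
        for taxon in taxa:
            # A strain without this marker gets a block of gaps.
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Toy, made-up protein alignments: marker -> {strain: aligned sequence}
toy_markers = {
    "rpsA": {"s1": "MKV", "s2": "MKI", "s3": "MRV"},
    "rplB": {"s1": "AG-T", "s2": "AGST"},
}
supermatrix, partitions = concatenate(toy_markers)
for taxon, seq in supermatrix.items():
    print(f">{taxon}\n{seq}")
print(partitions)
```

The partition list records where each marker sits in the supermatrix, which is useful if you want the tree-building program to fit a separate model per gene.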


0

Can I use a similar approach at the nucleotide level (whole-genome sequences)? If I align a few genomes using Mauve, do I then need to extract the conserved regions, or is the tree generated by Mauve enough to show the phylogenetic relationships between all the genomes?

updated 3.1 years ago by Ram 45k • written 10.3 years ago by HG ★ 1.2k

0

Unless you're dealing with > 99% similar genomes, you should really do your analysis on the protein level.

10.3 years ago by 5heikki 11k

0

Could you please send the download link for the Bash script? I also want to do a similar analysis.

10.3 years ago by jeccy.J ▴ 60