How can I find the closest related organism (meaning by gene similarity or phylogenetic distance) for a given bacterial species? Also, how could I compare 2 species to determine how closely related they are genetically?
Thanks,
Greg
The quick method is to compare the sequences of the two large rRNA genes, both within and spanning their conserved regions. These conserved regions are where PCR primers are placed to amplify segments in metagenomic sequencing projects. So, with that sequence in hand, a simple BLASTN search against other bacterial genomes should give you the closest relative, much as it would be identified in a metagenomics paper.
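As a rough illustration of the BLASTN step, here is a minimal Python sketch that picks the top-scoring hit per query from BLAST+ tabular output (the kind produced with `-outfmt 6`). The sequence IDs and numbers in the example are invented for illustration, and ranking by bit score is just one reasonable choice, not the only way to define "closest".

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST6_FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(blast6_text, exclude_self=True):
    """Return {query_id: (subject_id, percent_identity, bitscore)} for the top hit."""
    best = {}
    reader = csv.reader(io.StringIO(blast6_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(BLAST6_FIELDS, row))
        # Optionally ignore the query matching itself in the database.
        if exclude_self and hit["qseqid"] == hit["sseqid"]:
            continue
        score = float(hit["bitscore"])
        current = best.get(hit["qseqid"])
        if current is None or score > current[2]:
            best[hit["qseqid"]] = (hit["sseqid"], float(hit["pident"]), score)
    return best


# Hypothetical BLASTN results for one 16S query (IDs and values are made up).
example = "\n".join([
    "my_16S\tmy_16S\t100.0\t1500\t0\t0\t1\t1500\t1\t1500\t0.0\t2771",
    "my_16S\tspecies_A_16S\t98.7\t1500\t19\t1\t1\t1500\t1\t1499\t0.0\t2650",
    "my_16S\tspecies_B_16S\t94.2\t1498\t85\t3\t1\t1498\t3\t1497\t0.0\t2210",
])
print(best_hits(example))
```

In practice you would read the text from the file BLASTN wrote rather than from an inline string.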
I cannot address your second question as deeply as I think it deserves, but I would begin by comparing whole genome to whole genome, in terms of both sequence similarity and conserved gene order and operon content. I would try to get at questions like: How many operons are exactly conserved? How many operons show a +1/-1 difference in gene count between the two species? How many operons show the same order (in terms of the genes within them, but more importantly in terms of conserved operon neighbors)? This is synteny.
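To make the operon questions above concrete, here is a small Python sketch. It assumes you already have operon predictions for both genomes and that genes have been mapped to shared ortholog IDs; the function name, the categories, and the example operons are all hypothetical. It counts operons that are exactly conserved, that contain the same genes in a different order, or that differ by one gene. It does not look at conserved operon neighbors, which would need genome coordinates.

```python
def compare_operons(operons_a, operons_b):
    """Classify each operon of genome A against the operons of genome B.

    Each operon is a tuple of ortholog IDs in genomic order.
    """
    set_b = set(operons_b)
    gene_sets_b = [frozenset(op) for op in operons_b]
    counts = {"exact": 0, "same_genes_reordered": 0,
              "plus_minus_one": 0, "other": 0}
    for op in operons_a:
        genes = frozenset(op)
        # Same genes in the same order; the reverse is accepted too,
        # assuming it only reflects the opposite strand.
        if op in set_b or op[::-1] in set_b:
            counts["exact"] += 1
        # Same gene content, different internal order.
        elif genes in gene_sets_b:
            counts["same_genes_reordered"] += 1
        # Exactly one gene gained or lost relative to some operon in B.
        elif any(len(genes ^ other) == 1 for other in gene_sets_b):
            counts["plus_minus_one"] += 1
        else:
            counts["other"] += 1
    return counts


# Made-up operons, for illustration only.
operons_a = [("trpE", "trpD", "trpC", "trpB", "trpA"),
             ("lacZ", "lacY", "lacA"),
             ("hisG", "hisD", "hisC"),
             ("araB", "araA", "araD")]
operons_b = [("trpE", "trpD", "trpC", "trpB", "trpA"),
             ("lacZ", "lacY"),
             ("hisD", "hisG", "hisC"),
             ("galE", "galT")]
print(compare_operons(operons_a, operons_b))
```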
To compare 2 species and determine how closely related they are, you can use SiBELia (Synteny Block ExpLoration tool). It can compare bacterial genomes and visualize the result in a useful way; see http://bioinf.spbau.ru/en/sibelia (description) or https://github.com/bioinf/Sibelia (code).
I don't have experience in whole genome comparison. But anyway, I would start with a 16S rRNA comparison using Greengenes (not only because it's my lab's tool ;-) ) or the RDP Classifier.
For the closest relative you can try using a 16S rRNA database such as the Ribosomal Database Project (RDP), SILVA or Greengenes. In RDP you can use the SeqMatch function.
For some groups, 16S may not provide the best resolution, so you will have to use another gene such as recA or rpoN. For even better resolution you can use multiple concatenated genes (as in MLST). If you have the genome available, you can use Average Nucleotide Identity (ANI) to compare genomes, as it has better resolution than 16S and correlates well with DNA-DNA hybridization values, which are the standard for defining species.
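For a feel of what an ANI calculation involves, here is a minimal Python sketch. It assumes you have already cut one genome into fragments and aligned each fragment against the other genome (for example with BLASTN), so that each fragment has a percent identity and an aligned length. The function name, the default cutoffs, and the example numbers are assumptions for illustration; dedicated ANI tools handle fragmenting, alignment and reciprocal comparison for you.

```python
def average_nucleotide_identity(fragment_hits, min_identity=30.0,
                                min_aligned_fraction=0.7):
    """Mean percent identity over fragments that pass the filters.

    fragment_hits: iterable of (percent_identity, aligned_length, fragment_length),
    one entry per query fragment (its best hit in the other genome).
    The default cutoffs are illustrative assumptions, not fixed rules.
    """
    kept = [pid for pid, aligned, frag_len in fragment_hits
            if pid >= min_identity and aligned / frag_len >= min_aligned_fraction]
    if not kept:
        return None  # no fragment aligned well enough to estimate ANI
    return sum(kept) / len(kept)


# Made-up best hits for four fragments of genome A against genome B.
hits = [(96.5, 1000, 1020),
        (95.1, 980, 1020),
        (88.0, 400, 1020),   # aligned over too little of the fragment; filtered out
        (97.2, 1020, 1020)]
ani = average_nucleotide_identity(hits)
print(f"ANI = {ani:.2f}%")
```

An ANI of roughly 95-96% is commonly cited as corresponding to the 70% DNA-DNA hybridization threshold for species, though it is best treated as a guideline.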
Is your bacterium of interest sequenced?