Scale up paralog synteny in 200 bacterial genomes
9.8 years ago
biotech ▴ 570

I would like to study conserved synteny around the members of a protein family across 200 bacterial genomes. The family has, for instance, 13 paralogs in one of the strains.

I've designed a strategy to do this, but maybe someone knows a better or faster solution. Perhaps the problem can be approached with a genomic island detection method [1].

We have some evidence that mobile genetic elements are present near these regions and could be playing a role in the evolution of the protein family, whose genes may lie inside genomic islands.

I've thought of a couple of approaches to this problem:

  1. Whole-genome comparison of the 200 genomes. However, this would only tell me whether the genes are present in each genome, not where they are located.
  2. Automate the classic synteny process. For the genes located 5' and 3' of each family member, run BLAST against the genome/predicted proteome of each of the 200 strains. If the gene is found, extract the 8 kb upstream (5') and the 8 kb downstream (3') of it. Store the sequences from each strain in a file, which gives 200 files. After this step, compare the proteins of each strain and report a presence/absence matrix.
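To make option 2 more concrete, here is a minimal Python sketch of the bookkeeping only, not of the BLAST step itself. It assumes the hit coordinates and the gene annotations have already been parsed into plain tuples. The function names (`flank_windows`, `genes_in_window`, `presence_absence`) and the data layout are hypothetical, chosen just for illustration. It also clamps windows at the contig ends, so it does not handle wrap-around on circular chromosomes.

```python
from typing import Dict, List, Set, Tuple

FLANK = 8_000  # 8 kb on each side, as described in option 2

Gene = Tuple[str, int, int]   # (gene or family label, start, end), 1-based inclusive
Window = Tuple[int, int]      # (start, end), 1-based inclusive


def flank_windows(start: int, end: int, contig_len: int,
                  flank: int = FLANK) -> Tuple[Window, Window]:
    """Return the upstream and downstream windows around a hit.

    Windows are clamped to the contig; circular wrap-around is NOT handled.
    """
    upstream = (max(1, start - flank), start - 1)
    downstream = (end + 1, min(contig_len, end + flank))
    return upstream, downstream


def genes_in_window(genes: List[Gene], window: Window) -> List[str]:
    """List the labels of genes that overlap the window by at least 1 bp."""
    lo, hi = window
    return [name for name, s, e in genes if s <= hi and e >= lo]


def presence_absence(neighbours: Dict[str, Set[str]]
                     ) -> Tuple[List[str], Dict[str, List[int]]]:
    """Build a strain x neighbour-family matrix of 1/0 values.

    `neighbours` maps each strain to the set of gene families found
    in the flanking windows of that strain.
    """
    families = sorted(set().union(*neighbours.values()))
    matrix = {strain: [1 if fam in found else 0 for fam in families]
              for strain, found in neighbours.items()}
    return families, matrix
```

In practice, the per-strain sets passed to `presence_absence` would come from clustering the flanking proteins into families (for example by reciprocal BLAST), which this sketch leaves out.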

Thanks

[1] Langille MG, Hsiao WW, Brinkman FS. Detecting genomic islands using bioinformatics approaches. Nat Rev Microbiol. 2010 May;8(5):373-82. doi: 10.1038/nrmicro2350.

http://www.ncbi.nlm.nih.gov/pubmed/20395967

UPDATE: I'm now reading about tools for large-scale synteny tasks, found with this PubMed query: ("Synteny/genetics"[Mesh]) AND "Software"[Mesh]

genomics synteny bacteria • 1.9k views