Retrieving highly variable gene clusters from a wide range of microbial genomes by similarity searches
4.7 years ago

Hi everyone, this is my first post on this forum, so I'd like to greet the whole Biostar community. In my current work I am studying the synthesis of EPS by soil bacteria. I want to compare the genomic regions containing the EPS synthesis genes across a whole genus, but I don't quite know where to start. Should I download all available genomes for the genus or only the reference ones? Do you build a local BLAST database, or is that not necessary? Given that individual genes may or may not be present in a particular species, how do I determine the boundaries of the region to compare? And finally, what software would you recommend for the comparison itself? I know Mauve and EasyFig, but maybe there is something else? I will be extremely grateful for any answers!

microbial genomics comparative genomics • 879 views
4.7 years ago
gayachit ▴ 200

Hello

I worked on microbial genomes quite some time ago, but from what I can tell, if reference genomes are available, Mauve is easy to use for what you need. You could load the GenBank files and check whether particular genes are present or not. Besides this, you could also search for single-copy orthologs, i.e. genes that occur as exactly one copy in every genome. This exercise would also show you which genes are unique to individual genomes.
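The presence/absence check described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the gene names for each genome have already been extracted (for example from the gene qualifiers of the GenBank files) and that the same name means the same gene across genomes, which is often not true for real annotations. The function name `classify_genes` and the genome and gene labels are hypothetical.

```python
from collections import Counter


def classify_genes(genomes):
    """Classify genes by their distribution across genomes.

    genomes: dict mapping a genome label to a list of gene names,
             where a name appears once per copy in that genome.
    Returns (single_copy, unique):
      single_copy: sorted list of genes with exactly one copy in every genome
      unique: dict mapping a genome label to the genes found only there
    """
    # Count the copies of each gene name per genome.
    counts = {label: Counter(genes) for label, genes in genomes.items()}
    all_genes = set().union(*counts.values()) if counts else set()

    # Single-copy genes: exactly one copy in each genome.
    single_copy = sorted(
        g for g in all_genes if all(c[g] == 1 for c in counts.values())
    )

    # Unique genes: present in one genome only.
    unique = {}
    for gene in sorted(all_genes):
        owners = [label for label, c in counts.items() if c[gene] > 0]
        if len(owners) == 1:
            unique.setdefault(owners[0], []).append(gene)
    return single_copy, unique


if __name__ == "__main__":
    # Hypothetical labels, not real annotation data.
    example = {
        "genome_1": ["geneA", "geneB", "geneC"],
        "genome_2": ["geneA", "geneB", "geneB"],
        "genome_3": ["geneA", "geneD"],
    }
    single_copy, unique = classify_genes(example)
    print("single-copy in all genomes:", single_copy)
    print("unique per genome:", unique)
```

For real data, matching by gene name is usually too fragile, so the lists would more likely hold ortholog group identifiers produced by a dedicated orthology tool.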

I managed to move on. I downloaded the reference genomes using ncbi-genome-download and loaded them into Mauve. But I have a problem with the .gbff files: the individual replicons (chromosome, plasmids), each with its own annotation, are loaded into Mauve as separate sequences. Do you know any way to combine these sequences into one without losing the annotation?

Hi, I know that Mauve is very particular about its input files. It only recognizes .fna and .gbk. Try loading .gbk files, and make sure you have the full GenBank files.

Hi, thanks for your response. I managed to merge the individual replicons into a single .gbk file and run the alignment for a few genomes, but doing that for, say, 50 genomes seems extremely time-consuming.
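Merging replicons is mostly a matter of concatenating the sequences and shifting each feature's coordinates by the length of everything placed before it, which is easy to script once and then loop over many genomes. The sketch below shows only that coordinate logic on in-memory data. The `Feature` and `Replicon` classes, the `merge_replicons` function and the run of N characters used as a spacer between replicons are all assumptions made for illustration. In practice the records would be read from and written back to GenBank files with a parser such as Biopython.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    name: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


@dataclass
class Replicon:
    name: str
    sequence: str
    features: list = field(default_factory=list)


def merge_replicons(replicons, spacer="N" * 100):
    """Concatenate replicons into one sequence and shift feature coordinates.

    A spacer is placed between consecutive replicons so that no alignment
    block or feature silently spans a replicon boundary (an assumption of
    this sketch, not a requirement of any particular tool).
    Returns (merged_sequence, merged_features).
    """
    parts = []
    merged_features = []
    offset = 0  # length of everything already placed in the merged sequence
    for index, replicon in enumerate(replicons):
        if index > 0:
            parts.append(spacer)
            offset += len(spacer)
        for feat in replicon.features:
            # Shift the feature into the merged coordinate system.
            merged_features.append(
                Feature(feat.name, feat.start + offset, feat.end + offset)
            )
        parts.append(replicon.sequence)
        offset += len(replicon.sequence)
    return "".join(parts), merged_features


if __name__ == "__main__":
    # Toy data; real replicons would come from the downloaded annotation files.
    chromosome = Replicon("chromosome", "ATGAAATAG", [Feature("gene_1", 0, 9)])
    plasmid = Replicon("plasmid_1", "CCATGCCCTAA", [Feature("gene_2", 2, 11)])
    sequence, features = merge_replicons([chromosome, plasmid], spacer="N" * 5)
    for feat in features:
        print(feat.name, feat.start, feat.end, sequence[feat.start:feat.end])
```

Wrapping a function like this in a loop over a directory of genome files would take care of the 50-genome case in one run instead of merging each file by hand.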
