Bacterial Annotation Pipeline
6
5
13.1 years ago
scapella ▴ 390

Hi,

This might be an old question, but I haven't found a real answer. Does anyone know of an annotation pipeline (automatic or not) for working with bacterial species? In my case, there is no reference genome close to my species.

bacteria next-gen-sequencing • 11k views
0

Thanks guys for your answers! I'll try RAST and BG7. Both look very promising!

0

Hopefully I'll be releasing and publishing my Prokka in early 2012.

0

Hello everybody,

Can anyone suggest a solution? I annotated my genome sequence with Prokka, but when I checked the sequence with BLAST I found that some ORFs don't start or end at the same positions as in the BLAST results. Is there another server that gives a reliable annotation and reliable ORF boundaries, or a server that I can use to correct them manually?

Thank you very much
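
One way to make this kind of comparison systematic is to put the predicted ORF coordinates and the BLAST-derived coordinates side by side and list only the features whose boundaries disagree. The following is a minimal sketch, not part of Prokka or BLAST: the `Interval` type, the `boundary_differences` function, the matching of features by name, and all coordinates are assumptions made up for illustration. Keep in mind that a BLAST hit is a local alignment, so it does not have to span the whole ORF; a mismatch is a candidate for manual review rather than proof of a wrong annotation.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A feature on one strand; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str


def boundary_differences(
    predicted: List[Interval],
    reference: List[Interval],
    tolerance: int = 0,
) -> List[Tuple[str, int, int]]:
    """Return (name, start_delta, end_delta) for features whose boundaries differ.

    Features are matched by name (an assumption of this sketch); features
    without a same-strand counterpart are skipped.
    """
    reference_by_name = {r.name: r for r in reference}
    differences = []
    for p in predicted:
        r = reference_by_name.get(p.name)
        if r is None or r.strand != p.strand:
            continue
        start_delta = p.start - r.start
        end_delta = p.end - r.end
        if abs(start_delta) > tolerance or abs(end_delta) > tolerance:
            differences.append((p.name, start_delta, end_delta))
    return differences


# Hypothetical coordinates, for illustration only.
predicted_orfs = [
    Interval("orf_1", 100, 400, "+"),
    Interval("orf_2", 900, 1500, "-"),
]
blast_hits = [
    Interval("orf_1", 100, 400, "+"),
    Interval("orf_2", 930, 1500, "-"),
]

mismatches = boundary_differences(predicted_orfs, blast_hits)
for name, start_delta, end_delta in mismatches:
    print(f"{name}: start differs by {start_delta}, end differs by {end_delta}")
```

The flagged features can then be inspected by hand in a genome browser or annotation editor.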

7
13.1 years ago
Marina Manrique ★ 1.3k

Hi!

We (at Oh no sequences!) have developed an annotation system specifically designed for bacterial genomes and NGS data. It's called BG7, and probably the most interesting feature for you is that a close reference genome is not needed.

Other annotation pipelines, like those based on ORF prediction with Glimmer, depend strongly on having a close reference genome. The BG7 system works very well even when you don't have one. You just need a set of what we call 'reference proteins' that will guide the annotation. These proteins don't need to be very similar to the proteins you expect to find in your genome, so it's no problem if you don't have a close reference. We've tested it on lots of genomes (some of them with no similar sequences) and are very happy with the results.

The system is open source (AGPLv3 license), so you can freely use it.

We're about to launch its website. Meanwhile, you can take a look at these slides describing it and at the result files for the E. coli Germany outbreak that we published in this GitHub repository (the system also outputs annotations in other formats such as gbk and embl; this is just an example of the annotations).

Please let me know if you want to know anything else. @pablopareja is the main developer, so you can also ask him.

HTH

Marina

EDIT: We've just launched the bg7 website http://bg7.ohnosequences.com/ please feel free to try it (any feedback is highly appreciated) :)

0

Could you please let me know if there is an installation manual, and how to run BG7?

5
13.1 years ago

RAST works really well.

RAST (Rapid Annotation using Subsystem Technology) is a fully-automated service for annotating bacterial and archaeal genomes. It provides high quality genome annotations for these genomes across the whole phylogenetic tree.

4
13.1 years ago
Scott Cain ▴ 770

The GMOD project has several alternatives, of which MAKER (mentioned above) is one, though it leans a little towards the euks. Another option, which was designed for work with prokaryotes, is DIYA (though looking at that page now, it looks like SourceForge is messing with our wiki page). There is also Ergatis, which was designed by the people at TIGR/JCVI for doing bacterial annotation, which they know how to do very well (they are now at the University of Maryland). Ergatis is by far the most powerful, but it is overkill to install if you are only doing one genome. In that case, you might want to look at CloVR, which I am pretty sure is powered by Ergatis but runs inside a virtual machine that you can download (I think they have options for running it on the cloud too, but I haven't talked to them in a while).

0

Any update on DIYA? I would like to include it as a Galaxy module for routine annotation of environmental clones. However, it seems like it hasn't seen much action in a while.

3
13.1 years ago

It takes a bit of time to set up, but try MAKER.

1
9.8 years ago
dago ★ 2.8k

Prokka is quite good and fast, and you do not need a reference genome.

It performs ORF prediction and annotation for you using several well-established tools.
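
As a rough illustration of how a Prokka run could be scripted, here is a minimal Python sketch that builds the command line and only launches it if the `prokka` executable is on the PATH. The helper names `build_prokka_command` and `run_prokka`, the file name `contigs.fa`, the output directory `annotation`, and the prefix `my_genome` are placeholders chosen for this example; `--outdir`, `--prefix`, and `--cpus` are standard Prokka options.

```python
import shutil
import subprocess
from typing import List


def build_prokka_command(
    contigs_fasta: str, outdir: str, prefix: str, cpus: int = 4
) -> List[str]:
    """Assemble a Prokka command line as an argument list (no shell quoting needed)."""
    return [
        "prokka",
        "--outdir", outdir,    # directory that will receive the result files
        "--prefix", prefix,    # base name for the output files
        "--cpus", str(cpus),   # number of CPU threads to use
        contigs_fasta,         # assembled contigs in FASTA format
    ]


def run_prokka(contigs_fasta: str, outdir: str, prefix: str, cpus: int = 4) -> None:
    """Run Prokka, failing early with a clear message if it is not installed."""
    if shutil.which("prokka") is None:
        raise RuntimeError("prokka executable not found on PATH")
    subprocess.run(
        build_prokka_command(contigs_fasta, outdir, prefix, cpus), check=True
    )


# Placeholder file and directory names; nothing is executed here.
command = build_prokka_command("contigs.fa", "annotation", "my_genome")
print(" ".join(command))
```

Calling `run_prokka("contigs.fa", "annotation", "my_genome")` would then start the actual annotation on a machine where Prokka is installed.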

0
9.8 years ago
wrf ▴ 70

This thread seems to have died despite this not being a solved problem. One could also check Prodigal. It predicts protein-coding genes very quickly, in something like 10 seconds. It is a single binary to download, and it runs fast since bacterial genomes are small. If it doesn't work, then not much time is lost.
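
For anyone who wants to look at the predictions programmatically: Prodigal can write its gene calls as GFF, for example with `prodigal -i genome.fna -o genes.gff -f gff` (the file names are placeholders). The sketch below reads the CDS coordinates out of such a file in Python. The `Cds` type, the `parse_cds_from_gff` function, and the sample records are invented for illustration and only rely on the standard nine-column, tab-separated GFF layout.

```python
from typing import List, NamedTuple


class Cds(NamedTuple):
    """One predicted coding sequence; coordinates are 1-based and inclusive."""
    contig: str
    start: int
    end: int
    strand: str


def parse_cds_from_gff(gff_text: str) -> List[Cds]:
    """Extract CDS features from GFF text, skipping comments and other feature types."""
    genes = []
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        genes.append(Cds(fields[0], int(fields[3]), int(fields[4]), fields[6]))
    return genes


# Made-up records in GFF layout, not real Prodigal output.
SAMPLE_GFF = (
    "##gff-version  3\n"
    "contig_1\tProdigal\tCDS\t3\t98\t5.6\t+\t0\tID=1_1\n"
    "contig_1\tProdigal\tCDS\t150\t1100\t40.2\t-\t0\tID=1_2\n"
)

genes = parse_cds_from_gff(SAMPLE_GFF)
for gene in genes:
    print(gene.contig, gene.start, gene.end, gene.strand)
```

With a real run, the text would come from reading `genes.gff` instead of the sample string.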
