6.8 years ago
A Soggy Waffle
Hey all, I have a project in which around 250 prokaryote genomes will be analyzed, so every step must be quick and scalable. I have searched but been unable to find any resources that discuss the running time or rate-limiting step of different gene prediction tools/algorithms. Do any of you know of such information? Also, how would you suggest I do gene prediction on so many genomes in a timely manner? FYI, the genomes are not all of the same species (they are not all Escherichia sp., for example). I was imagining a metagenomics approach would be best, but what do you think?
How quick do you need it to be? For most bacterial genomes, Prokka runs in a matter of minutes. You could easily annotate all 250 genomes in a couple of days, I should think.
Faster is better. My job is just prediction, no functional annotation needs to be done.
I would guess GLIMMER or Prodigal would be the go-to tools for this (if it's all prokaryotic), but I don't have any running times for them. Prokka uses Prodigal internally, though, so running Prodigal on its own should be significantly faster than a full Prokka run.
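If it helps, here is a rough sketch of how prediction-only runs could be spread over several cores, since each genome is independent. It assumes Prodigal is installed and on your PATH and that the genomes are one FASTA file each with a `.fna` extension. The function names, directory layout, and worker count are made up for illustration. The Prodigal flags used are `-i` (input), `-o` (output), `-f gff` (output format), `-a` (protein translations), and `-q` (quiet).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_prodigal_cmd(fasta, out_dir):
    """Return the Prodigal command line for one genome FASTA file."""
    fasta = Path(fasta)
    out_dir = Path(out_dir)
    stem = fasta.stem  # file name without the extension
    return [
        "prodigal",
        "-i", str(fasta),                     # input genome
        "-o", str(out_dir / f"{stem}.gff"),   # gene coordinates
        "-f", "gff",                          # coordinate output format
        "-a", str(out_dir / f"{stem}.faa"),   # predicted protein sequences
        "-q",                                 # suppress progress output
    ]


def predict_genes(fasta, out_dir):
    """Run Prodigal on one genome; raises if Prodigal exits with an error."""
    subprocess.run(build_prodigal_cmd(fasta, out_dir), check=True)
    return Path(fasta).name


def run_all(genome_dir, out_dir, workers=8):
    """Run Prodigal on every *.fna file in genome_dir, `workers` at a time."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    fastas = sorted(Path(genome_dir).glob("*.fna"))
    # Threads are enough here: the real work happens in the Prodigal
    # subprocesses, so the Python side only waits for them to finish.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for name in pool.map(lambda f: predict_genes(f, out_dir), fastas):
            print(f"done: {name}")
```

Calling `run_all("genomes", "prodigal_out", workers=8)` would then process a hypothetical `genomes/` directory. By default Prodigal trains on each input genome separately (`-p single`), which should suit complete genomes from different species. The `-p meta` mode is there if some inputs are short or fragmented. Wrapping the call in a timer on a handful of genomes would give you the per-genome running time you were looking for.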