I want to run gene prediction tools on a large eukaryotic genome dataset. I started with AUGUSTUS because it provides species-specific parameters that I need. However, I found that it runs quite slowly, often taking up to two hours to process a single FASTA file.
On the other hand, GlimmerHMM runs faster but lacks comprehensive species parameters. I am looking for gene prediction tools that can run efficiently while maintaining good accuracy.
Are there any faster gene prediction tools that still offer species-specific models? Besides using a large-scale computing cluster, what other methods can be used to accelerate gene prediction?
Thanks in advance for your suggestions!
Not sure why we are discussing this. Is it more important to you to get these gene predictions right, or to save 20-50 hours? Unless you have to submit a paper or defend a thesis / dissertation in two weeks, it shouldn't matter if you pick a higher quality predictor that takes a bit longer.
and
How many genomes are being analyzed? Unless it is hundreds (which would have taken significantly longer to assemble), how is something that takes two hours considered "running slowly"?
As others have said, 2 hours is nothing. Wait till you get to assembly or long read alignment. The quality of results is far more important.
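That said, to answer the "other methods" part of the question: if wall-clock time really is the bottleneck, one common trick that keeps the AUGUSTUS species models is to split the multi-FASTA by sequence and run several AUGUSTUS processes side by side on one multi-core machine. Below is a rough sketch of that idea in Python, not a tested pipeline. The helper names (`read_records`, `balance`, `augustus_command`, `run_parallel`) and the `augustus_chunks` directory are made up for illustration; the only AUGUSTUS-specific assumption is the basic `augustus --species=NAME file.fa` invocation with GFF written to stdout. It also only helps when the file contains more than one sequence.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import subprocess


def read_records(fasta_text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, parts = [], None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def balance(records, n_bins):
    """Greedy load balancing: longest sequence first, into the lightest bin."""
    bins = [[] for _ in range(n_bins)]
    loads = [0] * n_bins
    for rec in sorted(records, key=lambda r: len(r[1]), reverse=True):
        i = loads.index(min(loads))
        bins[i].append(rec)
        loads[i] += len(rec[1])
    return [b for b in bins if b]  # drop empty bins


def augustus_command(species, fasta_path):
    """Basic AUGUSTUS call; predictions go to stdout."""
    return ["augustus", f"--species={species}", str(fasta_path)]


def run_parallel(fasta_path, species, n_jobs, workdir="augustus_chunks"):
    """Write one chunk per bin and run AUGUSTUS on the chunks concurrently.

    Hypothetical driver: the chunk/output file naming is an assumption.
    """
    out = Path(workdir)
    out.mkdir(exist_ok=True)
    records = read_records(Path(fasta_path).read_text())
    jobs = []
    for i, group in enumerate(balance(records, n_jobs)):
        chunk = out / f"chunk_{i}.fa"
        chunk.write_text("".join(f">{h}\n{s}\n" for h, s in group))
        jobs.append((chunk, out / f"chunk_{i}.gff"))

    def run_one(job):
        chunk, gff = job
        with open(gff, "w") as fh:
            subprocess.run(augustus_command(species, chunk), stdout=fh, check=True)
        return gff

    # Threads are enough here: each worker just waits on an external process.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(run_one, jobs))
```

Something like `run_parallel("genome.fa", "human", n_jobs=8)` would then leave one GFF per chunk in the working directory. Gene identifiers will probably repeat across chunks, so the outputs need renumbering before they are merged.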
If you want to optimize