Efficient Gene Prediction on Large Eukaryotic Genomes
19 hours ago
Shuo • 0

I want to run gene prediction tools on a large eukaryotic genome dataset. I started with AUGUSTUS because it provides species-specific parameters that I need. However, I found that it runs quite slowly, often taking up to two hours to process a single FASTA file.

On the other hand, GlimmerHMM runs faster but lacks comprehensive species parameters. I am looking for gene prediction tools that can run efficiently while maintaining good accuracy.

Are there any faster gene prediction tools that still offer species-specific models? Besides using a large-scale computing cluster, what other methods can be used to accelerate gene prediction?

Thanks in advance for your suggestions!

tools prediction gene • 143 views

Not sure why we are discussing this. Is it more important to you to get these gene predictions right, or to save 20-50 hours? Unless you have to submit a paper or defend a thesis / dissertation in two weeks, it shouldn't matter if you pick a higher quality predictor that takes a bit longer.


"on a large eukaryotic genome dataset"

and

"I found that it runs quite slowly, often taking up to two hours to process a single FASTA file."

How many genomes are being analyzed? Unless it is hundreds (which would have taken significantly longer to assemble), why is something that takes two hours considered "running slowly"?


As others have said, 2 hours is nothing. Wait till you get to assembly or long-read alignment. The quality of results is far more important.

If you want to optimize:

  • learn a little patience <- start here
  • divide up the FASTA by chromosome or contig and run AUGUSTUS in parallel, then recombine the results
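
The split-and-parallelize suggestion could look roughly like the Python sketch below. It is only an illustration under some assumptions: `augustus` is on your PATH, the `seq_000000.fa` file naming and the function names are made up for this example, and `--gff3=on` is used simply to get GFF3 output. Note that each AUGUSTUS run numbers its genes from the start again, so plain concatenation of the per-contig outputs gives duplicate gene IDs; you would need to renumber or prefix them when recombining.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_fasta(fasta_path, out_dir):
    """Write each record of a multi-FASTA file to its own file; return the paths in input order."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    handle = None
    with open(fasta_path) as src:
        for line in src:
            if line.startswith(">"):
                if handle is not None:
                    handle.close()
                # Number the files so odd characters in headers cannot break file names.
                path = out_dir / f"seq_{len(paths):06d}.fa"
                handle = open(path, "w")
                paths.append(path)
            if handle is not None:
                handle.write(line)
    if handle is not None:
        handle.close()
    return paths


def augustus_command(fasta_path, species):
    """Build the AUGUSTUS command line for one sequence file."""
    return ["augustus", f"--species={species}", "--gff3=on", str(fasta_path)]


def predict(fasta_path, species):
    """Run AUGUSTUS on one file and save its standard output next to the input."""
    out_path = Path(fasta_path).with_suffix(".gff3")
    with open(out_path, "w") as out:
        subprocess.run(augustus_command(fasta_path, species), stdout=out, check=True)
    return out_path


def predict_parallel(fasta_path, species, work_dir, jobs=4):
    """Split the genome, then run up to `jobs` AUGUSTUS processes at once."""
    parts = split_fasta(fasta_path, work_dir)
    # Threads are enough here: each one only waits on an external process.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda p: predict(p, species), parts))
```

A call such as `predict_parallel("genome.fa", "human", "augustus_parts", jobs=8)` would then leave one `.gff3` file per contig in `augustus_parts`. Set `jobs` to about the number of free cores. If the assembly has many tiny contigs, batching several contigs per file would probably be better than one process per contig.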
