OrthoFinder running time
7.0 years ago
pnatsidis • 0

Hello,

I am running OrthoFinder on 34 species. Most have 20,000-30,000 proteins each, except 4 of them, which have ~60,000 genes. The BLAST all-v-all step took 11 days to finish, and the run is now at the "Running OrthoFinder algorithm" step. However, I have no idea how long it will take. Has anyone run OrthoFinder on a dataset this large who could give me an estimate? I am running it on 7 nodes, each with 128 GB RAM and 20 cores.

RNA-Seq next-gen blast sequencing gene • 4.7k views
7.0 years ago
david_emms ▴ 160

Hi

OrthoFinder developer here!

The very first thing to say is that I've recently added the option to use DIAMOND instead of BLAST, and I'm really impressed with it. From now on I would recommend virtually always using DIAMOND with OrthoFinder: it's about 60x faster in the tests I've done, and the resulting OrthoFinder accuracy is virtually identical. You can see public benchmark results here: http://orthology.benchmarkservice.org/cgi-bin/gateway.pl

Back to your question, as a very rough guess I'd say less than 5 days to finish but probably quicker.

To add a bit more detail, I am a little surprised the BLAST calculations took that long given the kind of computing power you've got there. For example, I ran an analysis on 128 fungal genomes earlier this week, which I think should be pretty similar to your analysis as the total number of sequences is approximately the same (~990,000 for yours versus ~1,280,000 for mine). I used only 1 node with 16 cores but used DIAMOND instead of BLAST. The total run time was under 19 hours to get all the orthogroups, gene trees, orthologues etc!
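As a back-of-envelope check on the numbers in this thread, the sketch below recomputes the total sequence count from the question (30 species at 20,000-30,000 proteins plus 4 at ~60,000) and shows what an 11-day BLAST step would shrink to if the ~60x DIAMOND speed-up held exactly. The function names are made up for illustration, and the speed-up scaling is an assumption, not a measured result for this dataset.

```python
def total_sequences(n_species, n_large, typical_proteins, large_proteins):
    """Total proteins when n_large species are bigger than the rest."""
    return (n_species - n_large) * typical_proteins + n_large * large_proteins


def scaled_hours(blast_days, speedup):
    """Hours a search step would take if it were `speedup` times faster.

    Assumes the speed-up applies uniformly, which is only a rough guide.
    """
    return blast_days * 24 / speedup


if __name__ == "__main__":
    # Figures quoted in the question: 34 species, 4 of them with ~60,000 genes.
    for typical in (20_000, 25_000, 30_000):
        total = total_sequences(34, 4, typical, 60_000)
        print(f"{typical:>6} proteins/species -> {total:,} sequences in total")

    # 11 days of BLAST at the ~60x speed-up reported for DIAMOND.
    print(f"~{scaled_hours(11, 60):.1f} hours if the 60x figure held")
```

With the midpoint of 25,000 proteins per species the total comes to 990,000, which matches the figure quoted in the answer.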

I am working on the performance as we speak, so newer versions will improve on the current ones. For example, the latest version uses a new method for getting orthologues from the gene trees instead of dlcpar. This has improved the accuracy and speed of that step significantly, so it is definitely worth considering if you're not using it already. There'll be a new paper coming out very soon which will provide details on these methods, but feel free to email me or message here if I can help with any specific problems with the analysis you're currently running.

All the best

David


Thanks a lot for your answer!! How can I run OrthoFinder with the option of DIAMOND instead of BLAST?


Install DIAMOND on your machine and make sure it is in the system path so that you can call it using "diamond". Then, when calling orthofinder, you just need to add the option "-S diamond_more_sensitive" for version 2.0.0 or earlier, or just "-S diamond" in future versions.

All the best David
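For anyone scripting this, here is a minimal sketch of the steps described above: check that "diamond" is on the path, then assemble the orthofinder call with the matching "-S" value. The helper names and the "proteomes" directory are hypothetical. The "-f" (FASTA directory) and "-t" (threads) options are not mentioned in this thread, so check them against `orthofinder -h` for your installed version before relying on them.

```python
import shutil


def diamond_on_path():
    """Return True if an executable called 'diamond' is found on the PATH."""
    return shutil.which("diamond") is not None


def build_orthofinder_command(fasta_dir, version_2_0_0_or_earlier=False, threads=None):
    """Assemble the orthofinder argument list without running it.

    Per the reply above: '-S diamond_more_sensitive' for version 2.0.0 or
    earlier, '-S diamond' for later versions.
    """
    search = "diamond_more_sensitive" if version_2_0_0_or_earlier else "diamond"
    cmd = ["orthofinder", "-f", fasta_dir, "-S", search]
    if threads is not None:
        # '-t' is assumed here to set the number of search threads.
        cmd += ["-t", str(threads)]
    return cmd


if __name__ == "__main__":
    if not diamond_on_path():
        print("diamond was not found on the PATH; install it first")
    # 'proteomes' is a placeholder directory of per-species FASTA files.
    print(" ".join(build_orthofinder_command("proteomes", threads=20)))
```

The printed command can be pasted into a job script, or the list can be passed to `subprocess.run` once the options have been checked against your version.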
