Can someone compare and contrast the following scientific workflow frameworks?
Taverna http://www.taverna.org.uk/
Pegasus http://pegasus.isi.edu/
What sort of criteria did you have in mind? Yes, they're both (fine) workflow systems but they're rather different from each other.
Pegasus is focused on coordinating programs that transform files into other files, which is definitely a powerful paradigm. In tooling terms, it works through command line tools only: if you're working with complex workflows, you'll _definitely_ want to have an extra tool around it to generate the workflow for you from some higher-level description. The tooling is such that it is relatively easy to use it behind websites, with other programs, etc. Because you typically generate the workflow programmatically, you can easily adapt the workflow according to your particular requirements (provided you're a programmer).
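To make "generate the workflow from some higher-level description" concrete, here is a minimal sketch of the idea in Python. It is not Pegasus's actual API or file format: the job dictionary layout, the program names `convert` and `combine`, and the file names are all hypothetical, chosen only to show how a file-in/file-out workflow can be built by a short program and how the dependencies fall out of the files themselves.

```python
import json


def build_workflow(samples):
    """Build a hypothetical file-based workflow description.

    One 'convert' job per sample, followed by a single 'combine' job
    that consumes every converted file. All names are illustrative.
    """
    jobs = []
    for name in samples:
        jobs.append({
            "id": f"convert_{name}",
            "program": "convert",            # hypothetical executable
            "inputs": [f"{name}.raw"],
            "outputs": [f"{name}.dat"],
        })
    jobs.append({
        "id": "combine",
        "program": "combine",                # hypothetical executable
        "inputs": [f"{name}.dat" for name in samples],
        "outputs": ["combined.dat"],
    })

    # Job B depends on job A whenever A produces a file that B consumes.
    producer = {out: job["id"] for job in jobs for out in job["outputs"]}
    dependencies = sorted(
        (producer[f], job["id"])
        for job in jobs
        for f in job["inputs"]
        if f in producer
    )
    return {"jobs": jobs, "dependencies": dependencies}


if __name__ == "__main__":
    print(json.dumps(build_workflow(["a", "b"]), indent=2))
```

Changing the list of samples changes the shape of the workflow, which is the kind of adaptation that is easy when the workflow is generated by code rather than drawn by hand.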
Taverna is much more of an integrated system, and it works at a higher level of abstraction — not files, but rather the data that you might find in them, and not programs, but rather the operations you might do with them. It comes as a GUI workbench for creating the workflows, running them and visualizing them, a command line tool for just running them, and a server for running workflows when you need more power (e.g., to support a portal). The workflows are comparatively awkward to edit by mechanisms other than using the workbench.
To look at the other things you mention in your comment: neither is `make`, and neither works like it. Yes, they all support branching, parallel execution and data merging, but some of the other things you mention under this just don't make sense.

_Declaration: I'm one of the Taverna developers._
What criteria do you want them compared with?
By tracking I mean: when run on a cluster using a job scheduler, does the workflow system track what a process is doing on a node and present this to the user?
Looks like this is an old question and the answerer here has not been around for a while; does anyone else know if Taverna is able to use something like an HPC cluster running SGE? It would have to submit jobs, e.g. with `qsub`, and then monitor them for completion, etc.
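For anyone unsure what "submit with `qsub` and then monitor" involves, a rough sketch of that loop in Python follows. This says nothing about how Taverna (or Pegasus) actually talks to a scheduler; it only illustrates the pattern. It assumes an SGE-style installation where `qsub -terse` prints just the job id and `qstat -j <id>` exits non-zero once the scheduler no longer knows the job. The function names and the injectable `run` and `sleep` parameters are my own, added so the logic can be exercised without a cluster.

```python
import subprocess
import time


def submit(script, run=subprocess.run):
    """Submit a job script and return its id.

    Assumes SGE's `qsub -terse`, which prints only the job id.
    """
    result = run(["qsub", "-terse", script],
                 capture_output=True, text=True, check=True)
    return result.stdout.strip()


def is_active(job_id, run=subprocess.run):
    """Return True while the scheduler still lists the job.

    Assumes `qstat -j <id>` exits non-zero for unknown or finished jobs.
    """
    result = run(["qstat", "-j", job_id], capture_output=True, text=True)
    return result.returncode == 0


def wait_for(job_id, poll_seconds=30, run=subprocess.run, sleep=time.sleep):
    """Poll until the job is no longer queued or running."""
    while is_active(job_id, run=run):
        sleep(poll_seconds)
```

Note that this only detects that a job has left the queue, not whether it succeeded; finding the exit status would need something further, such as the scheduler's accounting records.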