I have seen increasing interest in workflow/pipeline management systems such as snakemake and nextflow. In my opinion, both seem very interesting and very promising. There is a very interesting review from 2016 in which bash, make, snakemake and nextflow were compared: https://www.jmazz.me/blog/NGS-Workflows
The author of that review did a very good job of analyzing the strengths and weaknesses of snakemake and nextflow. I am not sure how much has changed since then, but in your experience, what criteria could bioinformaticians consider when choosing one over the other? Have some of the identified weaknesses of snakemake and nextflow been addressed since then?
I started using snakemake 6 months ago, and I have now shifted all my pipelines to snakemake (ChIP-seq, RNA-seq, ATAC-seq and DNA-seq). I am pretty happy with it. Once you get the idea of how snakemake works (think in a bottom-up fashion: start from the final output you want and work back to the inputs), it is easy to build up your own pipelines. BTW, the documentation is awesome.
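To make the "bottom-up" idea concrete, here is a minimal sketch in plain Python. It is not snakemake code and not how snakemake is implemented; the rule table, file names and the plan function are all made up for illustration. It only shows the way of thinking: you ask for the final file, and the steps are worked out backwards from there.

```python
# Hypothetical rule table: each output file maps to (input files, step name).
# None of these names come from snakemake; they are purely illustrative.
RULES = {
    "sample.bam": (["sample.fastq"], "align"),
    "sample.sorted.bam": (["sample.bam"], "sort"),
    "sample.peaks.bed": (["sample.sorted.bam"], "call_peaks"),
}


def plan(target, existing, rules=RULES):
    """Return the steps needed to build `target`, in execution order."""
    steps = []
    seen = set()

    def visit(path):
        # Files that already exist (or were already planned) need no work.
        if path in existing or path in seen:
            return
        if path not in rules:
            raise ValueError(f"no rule produces {path!r} and it does not exist")
        seen.add(path)
        inputs, step = rules[path]
        # Resolve the inputs first, so their steps are scheduled earlier.
        for dep in inputs:
            visit(dep)
        steps.append(step)

    visit(target)
    return steps


order = plan("sample.peaks.bed", existing={"sample.fastq"})
print(order)  # ['align', 'sort', 'call_peaks']
```

You only name the file you want at the end; the order of the steps falls out of the input/output relationships.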
You can write a custom script for submitting jobs to the cluster for each platform (LSF, moab...) if you want more control over your jobs, e.g. https://bitbucket.org/snakemake/snakemake/issues/28/clustering-jobs-with-snakemake
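As a rough idea of what such a custom submission script could look like for LSF, here is a small Python sketch. The function name, the default values and the way the job script is passed in are my own assumptions, not something taken from the linked issue or from snakemake itself. The bsub flags used (-n, -M, -q, -o) are standard LSF options, but memory units and queue names depend on how your site is configured.

```python
def build_bsub_command(jobscript, threads=1, mem_limit=4000,
                       queue="normal", log="job.%J.out"):
    """Build an LSF submission command for one job script (hypothetical helper)."""
    return [
        "bsub",
        "-n", str(threads),    # number of job slots (cores)
        "-M", str(mem_limit),  # memory limit; the unit is site-dependent
        "-q", queue,           # queue name; "normal" is an assumed default
        "-o", log,             # LSF replaces %J with the job ID
        "sh", jobscript,       # run the job script with sh
    ]


cmd = build_bsub_command("jobscript.sh", threads=8, mem_limit=16000)
print(" ".join(cmd))
```

To actually submit, the list could be handed to subprocess.run(cmd, check=True) on a machine where bsub is available. Keeping the command as a list avoids shell quoting problems.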
The only downside for me is that when I have more than 1000 jobs to submit, it takes time for snakemake to process the metadata associated with each job. Even a dry-run takes minutes. I do not know how fast nextflow is.
And is BioMake out of the game? It uses Prolog (which is both its weakness and its strength...).
Hello, @kamiljaron. I was not aware of BioMake; I would have to read up on it. I do not know Prolog, nor have I ever used a logic programming language, but I will read more about what BioMake has to offer.
No knowledge of Prolog required! You can use GNU Make syntax to specify your workflow.
How to compare WDL/CWL and snakemake?
Never used either of them, but CWL is just a specification for how to describe a pipeline; it does not actually execute a pipeline itself. Snakemake executes the pipeline in addition to describing it with its own syntax.