Tool: Snakemake - Python-Based Workflow Management
11.0 years ago

From the snakemake website:

Build systems like GNU Make are frequently used to create complicated workflows, e.g. in bioinformatics. This project aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain specific specification language (DSL) in python style:

rule targets:
    input:  'plots/dataset1.pdf',
            'plots/dataset2.pdf'

rule plot:
    input:  'raw/{dataset}.csv'
    output: 'plots/{dataset}.pdf'
    shell:  'somecommand {input} {output}'

Like with GNU Make, in Snakemake you first specify targets in terms of a pseudo rule, and then how they are created via one or more steps of subsequent rule applications. Rules can be generalized via wildcards (here {dataset}). Everything is propagated top-down, i.e. here Snakemake determines that for the file "plots/dataset1.pdf" the rule plot has to be applied with wildcard {dataset} = dataset1 to the file raw/dataset1.csv. How the files are created is specified either with a shell command or python code. Further, Snakemake can interface with R to specify R code inside rules. Also see the FAQ to get an impression of the basic idea behind Snakemake.
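
The top-down wildcard resolution described above can be illustrated with a few lines of plain Python. This is only a minimal sketch of the idea, not Snakemake's actual implementation: the helper name `resolve` is made up for illustration, and it assumes each wildcard appears only once in the output pattern.

```python
import re


def resolve(target, output_pattern, input_pattern):
    """Match `target` against a rule's output pattern.

    Returns (wildcards, input_file), or None if the rule cannot
    produce the target. Hypothetical helper, for illustration only.
    """
    # Split 'plots/{dataset}.pdf' into literal text and wildcard names:
    # ['plots/', 'dataset', '.pdf']
    parts = re.split(r"\{(\w+)\}", output_pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text must match exactly.
            regex += re.escape(part)
        else:
            # A wildcard becomes a named group matching any non-empty text.
            regex += "(?P<%s>.+)" % part
    match = re.fullmatch(regex, target)
    if match is None:
        return None
    wildcards = match.groupdict()
    # Propagate the wildcard values into the input pattern.
    return wildcards, input_pattern.format(**wildcards)


result = resolve("plots/dataset1.pdf", "plots/{dataset}.pdf", "raw/{dataset}.csv")
print(result)
```

Asking for "plots/dataset1.pdf" yields the wildcard value dataset1 and the input file raw/dataset1.csv, which is the same conclusion the paragraph above walks through for the rule plot.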

A simple introduction, with a comparison of the same pipeline written in both Perl and Snakemake, is here: https://bitbucket.org/johanneskoester/snakemake/wiki/Getting%20Started%20with%20Snakemake%20and%20Qsub

This tool is excellent.

Too bad it only works with Python 3. Does anyone have a way to make it work without having to install Python 3?

I don't have a solution to that, but a very easy way to install Python 3 locally, without needing root access, is the pyenv program.

After installing pyenv, you can install Python 3 with, e.g. (assuming 3.4.1 is the version you want):

pyenv install 3.4.1

... and then activate that version in your local shell:

pyenv shell 3.4.1

... or globally with:

pyenv global 3.4.1

A Snakemake solution is here: http://coderscrowd.com/app/codes/view/192

Unfortunately, supporting Python 2 is not possible because of missing functionality in its multiprocessing module. However, setting up Python 3 in your home directory is very easy with virtualenv, or even without it, using ~/.local as the install prefix.

An example of how to do this might be worth a FAQ entry, since I think the Python 3 requirement is one of the main stumbling blocks to greater uptake.

I have added the virtualenv setup to the documentation.

Cool, I would propose adding documentation for setup with pyenv, since it takes away a lot of the complexity with the virtualenv / virtualenvwrapper / virtualenv-burrito mess.

Anaconda is an excellent and completely free Python distribution. It installs Python 3 along with mathematical packages that are sometimes hard to install, such as numpy, scipy, matplotlib and a few others.

* Installs cleanly into a single directory

* Doesn’t require root or local admin privileges

* Doesn’t affect other Python installs on your system, or interfere with OS X Frameworks

https://store.continuum.io/cshop/anaconda/

Great! I'll have to try that out

What's wrong with good old, robust GNU Make? It looks like I have to type less there while getting the same functionality. Are there any major advantages of Snakemake that I am missing?

It is much easier to code, and you can easily use Python (and all its libraries) as well as shell scripting in the workflow file. Each rule can have several input and output files, even when the exact number is not known in advance, and rules are easy to generalize so that they work with many different datasets: https://bitbucket.org/johanneskoester/snakemake/wiki/Documentation#markdown-header-wildcards
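
To make the point about generalizing over many datasets concrete, here is a rough sketch in plain Python of how a list of target files could be derived from whatever raw files happen to exist. The helpers `expand_pattern` and `targets_for` are hypothetical names for this illustration, and the `raw/` and `plots/` layout is borrowed from the example in the original post; Snakemake's own mechanism may differ in detail.

```python
import os
from itertools import product


def expand_pattern(pattern, **wildcards):
    """Fill `pattern` with every combination of the given wildcard values.

    Hypothetical helper, for illustration only.
    """
    names = sorted(wildcards)
    combos = product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


def targets_for(raw_files, raw_suffix=".csv"):
    """Derive one plot target per raw file, however many there are."""
    datasets = [
        os.path.basename(path)[: -len(raw_suffix)]
        for path in raw_files
        if path.endswith(raw_suffix)
    ]
    return expand_pattern("plots/{dataset}.pdf", dataset=datasets)


# The list of raw files could come from a directory listing at run time,
# so the number of datasets does not need to be known in advance.
print(targets_for(["raw/dataset1.csv", "raw/dataset2.csv", "raw/notes.txt"]))
```

Because the targets are computed with ordinary Python, the same workflow description can be reused for a different set of datasets without editing the rules.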

Snakemake is designed to work in a cluster environment. I routinely run snakemake over 3000 cores.

Interested in learning more about #SNAKEMAKE?

Register now for the first 2-day #SNAKEMAKE Workshop in Berlin with Johannes Köster https://johanneskoester.bitbucket.io/

You will learn how to create modern and reproducible #bioinformatic workflows
