Building a pipeline
9.1 years ago
skbrimer ▴ 740

Greetings, oh great and powerful collective hive mind,

The analysis I want to perform is very similar to the viral-ngs pipeline created at Broad for the Ebola analysis (Park et al., 2015, Cell 161, 1516-1526, June 18). However, I'm not working with Ebola or Illumina data, so I would like to tweak the pipeline for what I am doing.

I'm building a pipeline for my viral-ngs analysis and I want to use Snakemake. I'm working through the tutorial and the slides, but I have a few questions. I'm not a programmer by training, so I'm having some trouble understanding some things.

It seems like Snakemake works like, or is, a virtual Python environment, and I'm not sure how to get the programs I want to use into that environment. I'm reading the docs on Python packaging for virtual environments, and they say to use pip inside the virtual environment. Would I use pip inside my Snakemake environment?

I know that to create a new environment for a workflow, the command is

conda create -n myworkflow

I feel okay with making a config file and a Snakefile. I'm just not sure how to set my workflow up so that it has all the programs and dependencies I need in order to run the pipeline.

RNA-Seq analysis ngs snakemake reporting • 2.8k views

Isn't it just using your $PATH like everything else?


I do not think so. It seems to have something to do with a requirements file. For example, the tutorial has a requirements file that sets up the environment, but it is just a list of packages with pinned versions, e.g.

bcftools=1.2
bwa=0.7.12

Then you run the create command with the requirements file and it collects the needed packages, but I do not understand how it does that or where it pulls them from.
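To make the requirements file a little less mysterious, here is a minimal Python sketch of what it contains and how it is typically handed to conda. It only parses the `name=version` lines and builds a command list; it does not run conda. The environment name `myworkflow` comes from the question, while the file name `requirements.txt` and the `bioconda` channel are assumptions for illustration. Conda downloads prebuilt packages from whichever channels it is told to use, and both bwa and bcftools are distributed through Bioconda.

```python
def parse_requirements(text):
    """Parse conda-style 'name=version' lines into (name, version) pairs."""
    specs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, version = line.partition("=")
        specs.append((name, version or None))
    return specs


def conda_create_command(env_name, requirements_path, channel="bioconda"):
    """Build (but do not run) a 'conda create' command.

    The channel default is an assumption; use whichever channel hosts your tools.
    """
    return ["conda", "create", "-n", env_name, "-c", channel,
            "--file", requirements_path]


# The two lines quoted from the tutorial's requirements file.
requirements_text = "bcftools=1.2\nbwa=0.7.12\n"
specs = parse_requirements(requirements_text)
command = conda_create_command("myworkflow", "requirements.txt")

print(specs)
print(" ".join(command))
```

Running the printed command in a shell is what actually creates the environment and installs the pinned versions into it.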


Similar to

pip install -r requires.txt

Or something like this to install the needed packages into the env?


Yes, very similar. Then you activate the environment by name with

source activate myworkflow

instead of

source bin/activate
9.1 years ago

I happened to be playing around with Snakemake this morning and can confirm that you do not need conda, virtual environments, or anything like requires.txt. Snakemake can just use your $PATH. All of those things might be convenient once you have a bunch of workflows on a cluster, but I wouldn't start with the added complexity.
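If you want to check up front that the tools your rules call are actually reachable through $PATH, something like the following sketch can help. It uses Python's standard `shutil.which`; the tool names are just the two mentioned earlier in the thread, and the helper names `find_tools` and `missing_tools` are made up for this example.

```python
import shutil


def find_tools(tool_names):
    """Map each tool name to its full path on $PATH, or None if not found."""
    return {name: shutil.which(name) for name in tool_names}


def missing_tools(tool_names):
    """Return the tools that could not be found on $PATH."""
    return [name for name, path in find_tools(tool_names).items() if path is None]


# Tool names taken from the thread; replace with whatever your rules call.
tools = ["bwa", "bcftools"]
for name, path in find_tools(tools).items():
    print(f"{name}: {path if path else 'NOT FOUND on $PATH'}")
```

Activating a conda environment works the same way under the hood: it puts the environment's bin directory at the front of $PATH, so the same check applies with or without conda.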


Thanks Devon! This is very helpful!
