Greetings, oh great and powerful collective hive mind,
The analysis I want to perform is very similar to the viral-ngs pipeline created at the Broad for the Ebola analysis (Park et al., 2015, Cell 161, 1516-1526, June 18). However, I'm not working with Ebola or Illumina data, so I would like to tweak the pipeline for what I am doing.
I'm building a pipeline for my viral NGS analysis and I want to use Snakemake. I'm working through the tutorial and the slides, but I have a few questions. I'm not a programmer by training, so I'm having some trouble understanding some things.
It seems like Snakemake works like, or is, a virtual Python environment, and I'm not sure how to get the programs I want to use into that environment. I'm reading the docs on Python packaging for virtual environments, and they say to use pip inside the virtual environment, so would I use pip inside my Snakemake environment?
I know that to create a new workflow, it is:
conda create -n myworkflow
I feel okay with making a config file and a Snakefile. I'm just not sure how to set my workflow up so that it has all the programs and dependencies I need to run the pipeline.
Isn't it just using your $PATH like everything else?
I do not think so. It seems to have something to do with a requirements file. For example, the tutorial has a requirements file that sets up the environment, but it is just a list of files or packages, i.e.
Then you run the create command with the requirements file and it collects the needed packages, but I do not understand how it does it or where it pulls from.
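On the $PATH question: activating a conda environment puts that environment's bin directory at the front of PATH, so one way to see which programs a workflow would actually pick up is to ask Python where each tool resolves. Here is a minimal sketch using the standard library's shutil.which. The function name find_tools and the tool names in the list are hypothetical placeholders, not anything from the tutorial, so swap in whatever programs your own rules call.

```python
import shutil


def find_tools(tools):
    """Map each tool name to the executable path found on PATH, or None."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    # Hypothetical tool list: replace with the programs your rules call.
    wanted = ["snakemake", "samtools", "bwa"]
    for tool, path in find_tools(wanted).items():
        print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

If you run this once before and once after activating the environment, the reported paths should change to point inside the environment's directory for any tool that was installed there.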
Similar to
Or something like this to install the needed packages into the env?
Yes, very similar. Then you activate the environment with source activate inside the directory with
instead of