3.0 years ago
alexgalvez38
Hello,
I am trying to create a simple Snakemake workflow and I am having some issues. My file looks like this:
--------------------
ARCHIVE_FILE = 'output.tar.gz'

# a single output file
OUTPUT_FILE = 'output/{species}.out'

# a single input file
INPUT_FILE = 'proteins/{species}.fasta'

# Build the list of input files.
INP = glob_wildcards(INPUT_FILE).species

# The list of all output files
OUT = expand(OUTPUT_FILE, species=INP)

# pseudo-rule that tries to build everything.
# just add all the final outputs that you want built.
rule all:
    input: ARCHIVE_FILE

# hmmsearch
rule hmm:
    input:
        cmd='hmmsearch --tblout output_tblout_egf --noali -E 99',
        species=INPUT_FILE ,
        hmm='hmm/EGF.hmm'
    output: OUTPUT_FILE
    shell: '{input.cmd} {input.hmm} {input.species} {output}'

# create an archive with all results
rule create_archive:
    input: OUT
    output: ARCHIVE_FILE
    shell: 'tar -czvf {output} {input}'
--------------------
This file produces the two following errors:
---------------------
MissingInputException in line 29 of /home/agalvez/data/workflow-workshop/test/Snakefile:
Missing input files for rule hmm:
hmmsearch --tblout output_tblout_egf --noali -E 99
---------------------
MissingInputException in line 49 of /home/agalvez/data/workflow-workshop/test/Snakefile:
Missing input files for rule create_archive:
output/EP00771_Trimastix_marina.out
output/EP00759_Prokinetoplastina_sp_PhF-6.out
---------------------
This is the first time I have tried to use Snakemake or anything related to Python, so I do not understand why this is failing. Any help would be really appreciated. Thanks in advance!
input is a file that should exist, not a command...
Could you try to fix the formatting of the post?
Your code should be formatted like this:
The formatted file looks like this:
I think what @WouterDeCoster was trying to steer you toward is deleting the line
cmd='hmmsearch --tblout output_tblout_egf --noali -E 99',
and then writing that command out directly in the shell directive of rule hmm.
Input is for telling Snakemake which files should exist. That line is a shell command, so you just write it out as part of the shell command line of the rule.
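A minimal sketch of how rule hmm might look after that change, reusing the variable names from the question's Snakefile. The command text is copied from the deleted cmd line. The one thing I have assumed is how the result reaches {output}: hmmsearch takes only the HMM file and the sequence file as positional arguments, so here the main report is sent to {output} with the -o option. The thread does not say which file was meant to hold the results, so adjust as needed.

```snakemake
# Sketch only: the command lives in the shell directive, and
# input lists nothing but files that must exist.
rule hmm:
    input:
        species=INPUT_FILE,
        hmm='hmm/EGF.hmm'
    output: OUTPUT_FILE
    # Assumption: -o writes hmmsearch's main report to {output}.
    # --tblout still points at the fixed name from the question,
    # so every species would overwrite that same table file.
    shell:
        'hmmsearch --tblout output_tblout_egf --noali -E 99 '
        '-o {output} {input.hmm} {input.species}'
```

With the command out of input, Snakemake only checks for proteins/{species}.fasta and hmm/EGF.hmm before running the rule.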
I use input.cmd in this case because the script is a file I need for the rule to work. This way Snakemake makes sure that the script file that Python runs is available where the other input files are located and, most importantly, reruns the rule if that script is changed. (See about wordcount.py under 'Handling dependencies differently' here.)
Your example is more like how fastqc and trimmomatic are used here, or how python is used in the case I linked to, or how you use tar in your archiving rule. Those are programs installed on the system path (or in an environment, in some cases) and run with direct calls. They wouldn't be expected to be edited as often as a Python script kept with your data may be.