Pass a python dictionary as a snakemake variable (input and output)
2.4 years ago
Ivan ▴ 60

Within my Snakemake file I have a function called SNAKEMAKE_OUTPUT. Calling this function gives me three variables: INPUT, OUTPUT, and COMMANDS (named SAMPLES, RESULTS, and COMMANDS in the example below). The first variable is a list; the latter two are dictionaries whose keys are the elements of the first variable. For example:

SAMPLES = ["Mortimer"]
RESULTS = {"Mortimer": ["Mort_out1.x", "Mort_out2.x", ...]}
COMMANDS = {"Mortimer": ["bash_command_1", "bash_command_2", ...]}
all_outputs = [x for xs in RESULTS.values() for x in xs]

(The last line flattens the list of lists into a single list.)
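As a minimal sketch of that flattening step, the snippet below uses `itertools.chain.from_iterable`, which does the same job as the nested comprehension. The second sample ("Igor") and all file names beyond the two shown above are hypothetical, added only to show samples with different numbers of outputs.

```python
from itertools import chain

# Hypothetical data: "Igor" and its file names are made up for illustration.
RESULTS = {
    "Mortimer": ["Mort_out1.x", "Mort_out2.x"],
    "Igor": ["Igor_out1.x", "Igor_out2.x", "Igor_out3.x"],
}

# Flatten the per-sample lists into one list, keeping dictionary order.
all_outputs = list(chain.from_iterable(RESULTS.values()))

# Equivalent nested comprehension, as in the question.
all_outputs_comprehension = [x for xs in RESULTS.values() for x in xs]
```

Both forms give one flat list of file names, which is the shape that a target list for rule all would need.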

The idea behind this arrangement is that I can use it within a snakemake rule, in the following example:

rule all:
   input: 
      list(RESULTS.values())

rule randomname:
   run:
      sample = wildcards.sample
      commands = COMMANDS[sample]
      for command in commands: os.system(command)
   output:
      RESULTS[wildcards.sample]

The reason I want to do it this way is that the commands needed to produce a particular sample's output, and the output itself, might not share the same structure as the rest of the samples. For example, one sample might give me 2 outputs, while another gives me 3 or 4.

What am I doing wrong?

EDIT 1:

I'm including a full working version of what I want to do here. In a nutshell, download files from SRA archive, and then subject to an aligner like bwa or bowtie2. The thing that irks me is that I need to specify "raw_samples/{sample}_1.fastq" when I have this information in a Python dictionary. I'd really really like it if I didn't have to manually tweak that.

 SAMPLES, DOWNLOAD_OUTPUTS, DOWNLOAD_COMMANDS = SNAKEMAKE_OUTPUT("linker.csv",0,2)

#MAIN RULE
rule all:
    input:
        #expand("raw_samples/{sample}_{i}.fastq",sample=SAMPLES,i = range(1,2)),
        expand("processed_files/{sample}.bam",sample=SAMPLES)

rule download_samples:
    output:
        "raw_samples/{sample}_1.fastq",
        "raw_samples/{sample}_2.fastq"
    run:
        sample_name = wildcards.sample
        commands = DOWNLOAD_COMMANDS[sample_name]
        for command in commands:
            os.system(command)

rule align_stuff:
    "generic_alignment"
python snakemake widlcards • 2.3k views

Hello, what error message do you get? Also, since you do not assign any value to wildcards.sample, your Snakefile will not run...

As a general remark Snakemake is meant to run pipelines. It therefore expects a list of well defined operations, with outputs and inputs clearly expected, that follow one another. Is there something following your rule randomname? Because from your code, it feels like you are using it to run different operations in parallel. In this case, using the Parallel python module would be more appropriate (and easier I think).


Yes, I have something following the rule randomname - I just did a code snippet here in order to not clutter the message. The error message is

"NameError in line 18 of /storage/home/ipokrova/CraigLowe_BWA/testing_downloadsnakemake/snakefile:
name 'wildcards' is not defined"

I know that this is sort of unorthodox for a snakemake file, but I actually have a list of well defined outputs, inputs, and operations, it's just that they're in the form of Python variables. I'm editing the OP for more clarity.
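The NameError above is consistent with the output expression being evaluated when the Snakefile is read, before any job (and therefore any wildcards object) exists. The plain-Python sketch below is only an analogy, not Snakemake code: `eager_output`, `deferred_output`, and `job_wildcards` are hypothetical names. It contrasts an expression that looks up a global `wildcards` immediately with a function that receives wildcards as an argument later.

```python
from types import SimpleNamespace

RESULTS = {"Mortimer": ["Mort_out1.x", "Mort_out2.x"]}


def eager_output():
    # Stand-in for `output: RESULTS[wildcards.sample]`: the name `wildcards`
    # is looked up right away, but nothing has defined it yet.
    return RESULTS[wildcards.sample]


def deferred_output(wildcards):
    # Stand-in for code that is handed the wildcards of a concrete job.
    return RESULTS[wildcards.sample]


try:
    eager_output()
    eager_error = None
except NameError as err:
    eager_error = str(err)

# Hypothetical stand-in for the wildcards of one job.
job_wildcards = SimpleNamespace(sample="Mortimer")
deferred_result = deferred_output(job_wildcards)
```

To my knowledge, Snakemake accepts functions of wildcards for input and params, but output has to be given as plain strings or patterns, so the deferred form is not directly available for the output directive.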


Thanks, it is a lot clearer :) Doesn't just using expand("raw_samples/{sample}_1.fastq",sample=SAMPLES) as an input to your rule all work?


No, the second example is the example that works. I'm just including {sample}.bam to show that I'm using snakemake for downstream analysis. What irks me is that I need to "manually" input output for my rule download_samples, when I want to just read them from a python variable.
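One possible way to avoid typing the output patterns by hand is to derive them from the dictionary itself. The sketch below assumes that DOWNLOAD_OUTPUTS maps each sample name to full paths containing that name (for example "raw_samples/Mortimer_1.fastq"); the thread does not show the dictionary's actual contents, so the data, the second sample "Igor", and the helper `output_patterns` are all hypothetical.

```python
def output_patterns(outputs_by_sample):
    """Replace each sample name with "{sample}" and return the shared patterns.

    Raises ValueError if the samples do not all reduce to the same patterns.
    """
    per_sample = {
        sample: tuple(sorted(path.replace(sample, "{sample}") for path in paths))
        for sample, paths in outputs_by_sample.items()
    }
    distinct = set(per_sample.values())
    if len(distinct) != 1:
        raise ValueError("samples do not share a single set of output patterns")
    return list(distinct.pop())


# Hypothetical contents; the real dictionary comes from SNAKEMAKE_OUTPUT.
DOWNLOAD_OUTPUTS = {
    "Mortimer": ["raw_samples/Mortimer_1.fastq", "raw_samples/Mortimer_2.fastq"],
    "Igor": ["raw_samples/Igor_1.fastq", "raw_samples/Igor_2.fastq"],
}

PATTERNS = output_patterns(DOWNLOAD_OUTPUTS)
```

With data of this shape, PATTERNS holds the same two strings that are written by hand in rule download_samples, so it could presumably be passed as `output: PATTERNS`. The helper deliberately fails when samples have different numbers or shapes of outputs, since a single wildcard rule cannot describe that case; such samples would need separate rules.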
