Snakemake error with 'wildcard'
2.8 years ago
kamanovae ▴ 100

Hi!

I can't figure out what is wrong with my code. Snakemake reports this error:

RuleException in line 8 of /storage1/GatkBwaTest/SnakemakeDir/snakefile2:
NameError: The name 'wildcard' is unknown in this context. Please make sure that you defined that variable. Also note that braces not used for variable access have to be escaped by repeating them, i.e. {{print $1}}

The code looks like this:

(SAMPLES,) = glob_wildcards("../exom/{sample}_L001_R1_001.fastq")
INTERVALS = glob_wildcards("../SnakemakeInput/intervals/{interval}.bed")

rule all:
        input:
                expand("{sample}_R1.fastq", sample=SAMPLES)
rule sort:
        output:
                output1="{sample}_R1.fastq"
        shell:
                " zcat  {wildcard.sample}*R1*.fastq.gz | paste - - - - |  sort -k1,1 -S 30G | tr '\t' '\n' > {output.output1}"

I will be grateful for any answer!


What are you trying to do? You define both SAMPLES and INTERVALS, but you do not use them in the code?


I do not use INTERVALS here because this is only the first rule of a larger pipeline. SAMPLES contains the sample names, taken from files such as NIST7035_TAAGGCGA_L001_R1_001.fastq.gz. I want this first rule to handle only the R1 files, and then add the same shell command for R2.

2.8 years ago

It's a typo. Replace wildcard with wildcards.
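
To see why the misspelled name fails, here is a small plain-Python sketch. It assumes that Snakemake fills in the braces of a shell string in a way broadly similar to str.format; the real formatter is more elaborate, and the SimpleNamespace below is only a stand-in for the wildcards object, with a sample name borrowed from the thread.

```python
from types import SimpleNamespace

# Stand-in for the wildcards object of a rule (illustrative value only).
wildcards = SimpleNamespace(sample="NIST7035_TAAGGCGA")

good = "zcat {wildcards.sample}*R1*.fastq.gz"
bad = "zcat {wildcard.sample}*R1*.fastq.gz"

# The correctly spelled name resolves to the sample.
resolved = good.format(wildcards=wildcards)
print(resolved)

# The misspelled name is not defined anywhere, so formatting fails.
try:
    bad.format(wildcards=wildcards)
    failed = False
except KeyError as err:
    failed = True
    print("unknown name:", err)
```

Plain str.format raises a KeyError here; Snakemake reports the same kind of problem as the NameError quoted in the question.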


Can someone tell me how to use os.path in Snakemake?

Here is my code:

import os

(SAMPLES,) = glob_wildcards("../exom/NIST7035/{sample}_L001_R1_001.fastq.gz")

SAMPLE_PATH=os.path.dirname(config["sample"])
print(SAMPLE_PATH)

rule all:
        input:
                expand("{sample}_R1.fastq", sample=SAMPLES)

rule sort:
        output:
                output1="{sample}_R1.fastq"
        shell:
                " zcat SAMPLE_PATH/{wildcards.sample}*R1*.fastq.gz | paste - - - - |  sort -k1,1 -S 30G | tr '\t' '\n' > {output.output1} "

Now Snakemake does not understand the variable:

gzip: SAMPLE_PATH/NIST7035_TAAGGCGA*R1*.fastq.gz: No such file or directory
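
The bare word SAMPLE_PATH inside the shell string reaches the shell as literal text, because only names inside braces are substituted. A minimal sketch in plain Python, assuming a hypothetical value for config["sample"] (the thread does not show it) and using str.format as a rough stand-in for Snakemake's own formatting:

```python
import os
from types import SimpleNamespace

# Hypothetical config value; the real config["sample"] is not shown in the thread.
config = {"sample": "../exom/NIST7035/NIST7035_TAAGGCGA_L001_R1_001.fastq.gz"}
SAMPLE_PATH = os.path.dirname(config["sample"])  # directory part of the path

# Stand-ins for the objects available inside a rule (illustrative only).
wildcards = SimpleNamespace(sample="NIST7035_TAAGGCGA")
params = SimpleNamespace(sample_path=SAMPLE_PATH)

# Without braces, the variable name stays in the command as literal text.
literal = "zcat SAMPLE_PATH/{wildcards.sample}*R1*.fastq.gz".format(
    wildcards=wildcards
)

# With braces, the value is substituted before the command runs.
substituted = "zcat {params.sample_path}/{wildcards.sample}*R1*.fastq.gz".format(
    params=params, wildcards=wildcards
)
print(literal)
print(substituted)
```

In a Snakefile, the corresponding pattern would be a params entry such as sample_path=SAMPLE_PATH in the rule, referenced in the shell string as {params.sample_path}.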
2.8 years ago
User000 ▴ 710

I am not sure what you are trying to do, but maybe something like this:

workdir: "your/work/dir/"

SAMPLES=["NIST7035_TAAGGCGA"]

rule all:
    input:
        expand("{sample}_{r}.fastq", sample=SAMPLES, r=["R1","R2"])
rule sort:
    input:
        fastq="{sample}_L001_{r}_001.fastq.gz"
    output:
        output1="{sample}_{r}.fastq"
    shell:
        """
        zcat {input.fastq} | paste - - - - |  sort -k1,1 -S 30G | tr '\t' '\n' > {output.output1}
        """

Thanks for the answer, but the input FASTQ file names can be completely different, for example: sample1_L001_R1.fastq, sample2_R1.fastq, sample3_L002_R1_001.fastq. So that I do not have to change the code every time, I wanted to use * as on the command line.


I suggest you homogenise the sample names first. I do not think you can use * in Snakemake.
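
As a possible alternative to renaming the files, the shell-style * can be resolved in Python before the command runs. The sketch below uses the standard glob module; the helper name find_r1 and the temporary directory are hypothetical, and the file names are the examples given in the comment above.

```python
import glob
import os
import tempfile


def find_r1(directory, sample):
    """Return sorted paths of R1 FASTQ files whose names start with `sample`."""
    pattern = os.path.join(directory, f"{sample}*R1*.fastq*")
    return sorted(glob.glob(pattern))


# Demonstration with empty placeholder files in a temporary directory.
demo_dir = tempfile.mkdtemp()
for name in [
    "sample1_L001_R1.fastq",
    "sample2_R1.fastq",
    "sample3_L002_R1_001.fastq",
    "sample3_L002_R2_001.fastq",  # R2 file, must not be picked up
]:
    open(os.path.join(demo_dir, name), "w").close()

# Map each sample prefix to the base names of its matching R1 files.
matches = {
    sample: [os.path.basename(path) for path in find_r1(demo_dir, sample)]
    for sample in ["sample1", "sample2", "sample3"]
}
print(matches)
```

In a Snakefile, such a helper would typically be called from an input function, for example lambda wildcards: find_r1("../exom", wildcards.sample). Note that a prefix such as sample1 would also match files of a sample named sample10, so the pattern may need tightening for real data.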


I found a solution. This is the code that works:

(SAMPLES,) = glob_wildcards("../exom/NIST7035/{sample}_L001_R1_001.fastq.gz")
rule all:
        input:
                expand("{sample}_R1.fastq", sample=SAMPLES)
rule sort:
        output:
                output1="{sample}_R1.fastq"
        shell:
                " zcat  ../exom/NIST7035/{wildcards.sample}*R1*.fastq.gz | paste - - - - |  sort -k1,1 -S 30G | tr '\t' '\n' > {output.output1} "