snakemake wildcard in shell
22 months ago
samuel ▴ 260

I'm new to Snakemake and am trying to run GATK FastqToSam in a rule.

The rule below works if I hard-code the sample name in the shell command, but I want to read the sample name from the config file instead. I tried passing it through params like so:

configfile: "config.yaml"

fc = config["flowcell"]
index = config["index"]
samplename = config["sn"]
lane=["L01","L02","L03","L04"]


rule all:
    input:
        expand(["result/gatk4/{fc}_{lane}_{index}_fastqtosam.bam"], fc = fc, lane = lane, index = index)



rule fastqToSam:
    input:
        trimmed1 = "result/trimming/{fc}_{lane}_{index}_1_500k_trimmed.fq.gz",
        trimmed2 = "result/trimming/{fc}_{lane}_{index}_2_500k_trimmed.fq.gz"
    output:
        output="result/gatk4/{fc}_{lane}_{index}_fastqtosam.bam"
    log:
        "result/logs/fastqtosam/{fc}_{lane}_{index}.out"
    benchmark:
        "result/benchmarks/{fc}_{lane}_{index}.out"
    container:
        config["containers"]["gatk4"]
    params:
        RG="{fc}_{lane}",
        sn="{samplename}"
    threads: 8
    shell:
        """
        gatk --java-options "-Xmx8G" FastqToSam \
        FASTQ={input.trimmed1} \
        FASTQ2={input.trimmed2} \
        OUTPUT={output.output} \
        QUALITY_FORMAT=Standard \
        READ_GROUP_NAME={params.RG} \
        SAMPLE_NAME={params.sn} \
        LIBRARY_NAME=PCRFree \
        PLATFORM=MGI \
        SEQUENCING_CENTER=CG 2>{log}
        """

But I get the error:

    WildcardError in line ** of snakefile:
    Wildcards in params cannot be determined from output files. Note that you have to use a function to deactivate automatic wildcard expansion in params strings, e.g., `lambda wildcards: '{test}'`. Also see https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#non-file-parameters-for-rules:
    'samplename'

Is there a way of getting the samplename from the config file and using it in the shell part of the snakemake rule?

22 months ago
hugo.avila ▴ 530

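This happens because "{samplename}" inside a params string is Snakemake's wildcard notation: Snakemake tries to resolve it from the output file pattern, and there is no samplename wildcard in your output. Since you've already loaded the configuration file as a dictionary and stored the sample name in the "samplename" variable, you can simply pass the variable as is: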

    params:
        RG="{fc}_{lane}",
        sn=samplename  # <------ HERE: the Python variable, not a "{samplename}" string
    threads: 8
    # I also changed the quoting around the Java options to single quotes; the double quotes might cause you problems.
    shell:
        """
        gatk --java-options '-Xmx8G' FastqToSam \
        FASTQ={input.trimmed1} \
        FASTQ2={input.trimmed2} \
        OUTPUT={output.output} \
        QUALITY_FORMAT=Standard \
        READ_GROUP_NAME={params.RG} \
        SAMPLE_NAME={params.sn} \
        LIBRARY_NAME=PCRFree \
        PLATFORM=MGI \
        SEQUENCING_CENTER=CG 2>{log}
        """

If I'm correct, this should work.
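
If the sample name ever needs to differ per job, the function form mentioned in the error message is the usual alternative. Here is a minimal sketch in plain Python, assuming a hypothetical "samples" mapping in the config keyed by flowcell (this is not in the original config.yaml, which only has a single "sn" entry). SimpleNamespace stands in for the wildcards object that Snakemake passes to a params function:

```python
from types import SimpleNamespace

# Hypothetical config content, for illustration only.
config = {"samples": {"FC1": "sampleA", "FC2": "sampleB"}}


def sample_name(wildcards):
    """Look up the sample name from the fc wildcard of the current job."""
    return config["samples"][wildcards.fc]


# In the rule this would be passed without braces or quotes:
#     params: sn=sample_name
# Outside Snakemake, a SimpleNamespace can mimic the wildcards object.
fake_wildcards = SimpleNamespace(fc="FC2", lane="L01", index="idx01")
print(sample_name(fake_wildcards))
```

Because the function is passed as an object rather than a "{...}" string, Snakemake does not try to expand it as a wildcard pattern.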

However, reading input parameters from the configuration file can be somewhat inflexible. I recommend following the Snakemake documentation and reading input information from an input table instead.
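
A rough sketch of the input-table idea in plain Python, assuming a hypothetical tab-separated samples.tsv with the columns flowcell, index and sample (these column names are my own choice, not taken from your workflow). The table is inlined as a string so the snippet runs on its own:

```python
import csv
import io

# Stand-in for a hypothetical samples.tsv; in a workflow you would open the file instead.
SAMPLES_TSV = "flowcell\tindex\tsample\nFC1\tidx01\tsampleA\nFC1\tidx02\tsampleB\n"


def load_sample_table(handle):
    """Return a dict mapping (flowcell, index) to the sample name."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {(row["flowcell"], row["index"]): row["sample"] for row in reader}


sample_table = load_sample_table(io.StringIO(SAMPLES_TSV))

# Targets for `rule all` can be built from the table rather than from fixed config lists.
lanes = ["L01", "L02", "L03", "L04"]
targets = [
    f"result/gatk4/{fc}_{lane}_{idx}_fastqtosam.bam"
    for (fc, idx) in sample_table
    for lane in lanes
]
print(len(targets))
```

A params function can then look the sample name up in sample_table from the job's wildcards, so adding a sample only means adding a row to the table.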

Thanks hugo.avila! I don't think I can accept your answer, as it's a comment? Much appreciated!

Moved to answer. You can accept.

You're welcome ;)

No worries, sometimes the mods turn comments into accepted answers if they see that the problem was solved. I thought that maybe I had gotten your question wrong, so I submitted my answer as a comment, my bad.
