Hi!
I am trying to run a GATK pipeline using Snakemake. When I build the pipeline gradually, adding one rule at a time to the Snakefile so that the input/output data are produced in stages, the problem does not arise.
But when I run the whole pipeline at once, I get this error:
A USER ERROR has occurred: Illegal argument value: Positional arguments were provided ',out_GatherBQSRReports/NSK9.normal.GatherBQSRReports.txt}' but no positional argument is defined for this tool.
and this Snakemake error:
/gatk-4.1.9.0/gatk --java-options -Xmx7680m GatherBQSRReports -I -O out_GatherBQSRReports/NSK9.normal.GatherBQSRReports.txt (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
import glob

SAMPLES = glob_wildcards("input_bam/{sample}.bam")
INTERVALS = glob_wildcards("intervals/{interval}.bed")
all_samples_BaseRecalibrator = glob.glob("out_BaseRecalibrator/*-scattered.txt")

rule all:
    input:
        expand("out/{sample}.AddOrReplaceReadGroups.bam", sample = SAMPLES.sample),
        expand("out/{sample}.MarkDuplicates.txt", sample = SAMPLES.sample),
        expand("out/{sample}.MarkDuplicates.bam", sample = SAMPLES.sample),
        expand("out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt", interval=INTERVALS.interval, sample=SAMPLES.sample),
        expand("out_GatherBQSRReports/{sample}.GatherBQSRReports.txt", sample=SAMPLES.sample)

rule gatk_AddOrReplaceReadGroups:
    input: "input_bam/{sample}.bam"
    output: "out/{sample}.AddOrReplaceReadGroups.bam"
    shell: "./gatk-4.1.9.0/gatk --java-options ""-Xmx30g"" AddOrReplaceReadGroups -I {input} -O {output} -ID group1 -SM NORMAL -PL illumina -LB lib1 -PU unit1"

rule gatk_MarkDuplicates:
    input: rules.gatk_AddOrReplaceReadGroups.output
    output: output1="out/{sample}.MarkDuplicates.bam", output2="out/{sample}.MarkDuplicates.txt"
    shell: "./gatk-4.1.9.0/gatk --java-options ""-Xmx4g"" MarkDuplicates -I {input} -O {output.output1} -M {output.output2} --CREATE_INDEX true"

rule bedtools_genomecov:
    input: "input_bam/{sample}.bam"
    output: output1="out/{sample}.bedtools_genomecov.genome.covered.bed"
    shell: "bedtools genomecov -ibam -I {input} -bg > -O {output.output1}"

rule gatk_BaseRecalibrator:
    input: input1="intervals/{interval}.bed", input2="out/{sample}.MarkDuplicates.bam", input3='ref/ref.fa', input4="dbsnp/dbsnp_150.hg38.vcf.gz"
    output: "out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt",
    shell: "./gatk-4.1.9.0/gatk --java-options ""-Xmx7680m"" BaseRecalibrator -L {input.input1} -I {input.input2} -O {output} -R {input.input3} --known-sites {input.input4}"

rule gatk_GatherBQSRReports:
    input: all_samples_BaseRecalibrator
    output: "out_GatherBQSRReports/{sample}.GatherBQSRReports.txt"
    params: all_samples_BaseRecalibrator='-I'.join(all_samples_BaseRecalibrator)
    shell: "./gatk-4.1.9.0/gatk --java-options ""-Xmx7680m"" GatherBQSRReports -I {params.all_samples_BaseRecalibrator} -O {output}"
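The failing command has nothing between -I and -O. One likely cause is that the glob.glob call at the top of the Snakefile runs once, when the file is parsed, before any BaseRecalibrator job has written its report. The plain-Python sketch below reproduces that situation outside Snakemake. The temporary directory and the report file name are made up for illustration, and the gatk path is only used to build a string.

```python
import glob
import os
import tempfile

# Hypothetical scratch directory standing in for the pipeline's working directory.
with tempfile.TemporaryDirectory() as workdir:
    report_dir = os.path.join(workdir, "out_BaseRecalibrator")
    pattern = os.path.join(report_dir, "*-scattered.txt")

    # Evaluated up front, like the module-level glob in the Snakefile:
    # no BaseRecalibrator output exists yet, so the list is empty.
    reports_at_parse_time = glob.glob(pattern)

    # Later an upstream job writes its report (file name is an assumption).
    os.makedirs(report_dir)
    report = os.path.join(report_dir, "NSK9.normal.BaseRecalibrator.0000-scattered.txt")
    open(report, "w").close()

    # A fresh glob now sees the file, but the earlier list is not refreshed.
    reports_now = glob.glob(pattern)

# Joining an empty list gives an empty string, so -I is left without a value.
joined = "-I".join(reports_at_parse_time)
command = f"./gatk-4.1.9.0/gatk GatherBQSRReports -I {joined} -O out.txt"
print(repr(command))

# Separately, "-I" without surrounding spaces glues the paths together.
glued = "-I".join(["a.txt", "b.txt"])
print(glued)
```

If this is what happens in the pipeline, it would also explain why the rule behaves when the reports are already on disk before Snakemake starts.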
The value of all_samples_BaseRecalibrator is empty. Furthermore, why do you have two -I in this command?
Sorry, I tried to fix the error and added an extra input. However, nothing changes even if the last rule looks like this:
It seems to me that the cause of the error is in Snakemake, because gatk_GatherBQSRReports runs successfully if the folder already contains the results of the previous rules.
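The observation that the rule works once the reports already exist fits a list that was globbed before the jobs ran. One possible way around it, sketched here and not tested against this pipeline, is to derive the expected report paths from the sample and interval names, so that Snakemake knows the files by name before they exist. The helper names bqsr_reports and gatk_inputs are hypothetical. The commented rule shows how they might be wired in through an input function and a params function, both standard Snakemake features.

```python
def bqsr_reports(sample, intervals):
    """Expected BaseRecalibrator reports for one sample, one per interval."""
    return [
        f"out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt"
        for interval in intervals
    ]


def gatk_inputs(paths):
    """Format paths as repeated '-I path' arguments separated by spaces."""
    return " ".join(f"-I {path}" for path in paths)


# Possible wiring in the Snakefile (an assumption, adapt as needed):
#
# rule gatk_GatherBQSRReports:
#     input: lambda wildcards: bqsr_reports(wildcards.sample, INTERVALS.interval)
#     output: "out_GatherBQSRReports/{sample}.GatherBQSRReports.txt"
#     params: reports=lambda wildcards, input: gatk_inputs(input)
#     shell: "./gatk-4.1.9.0/gatk --java-options -Xmx7680m GatherBQSRReports {params.reports} -O {output}"

# Example with made-up interval names.
paths = bqsr_reports("NSK9.normal", ["0000-scattered", "0001-scattered"])
args = gatk_inputs(paths)
print(args)
```

Because every -I is produced by the helper, the shell line no longer carries a literal -I of its own, which avoids the doubled flag raised in the reply. Each sample also gathers only its own reports, which may or may not match the original intent.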