Hello hive mind,
I am having issues with a Snakemake script.
What I want it to do is use abricate to look for AMR genes, virulence factors, and plasmids in each of my samples and then make summary files for them.
Instead, Snakemake sees that the input files needed for the summary rule are missing and stops, rather than creating them.
EDIT - Here is the error message as well
sean@LEN943:~/Desktop/salmonella/LS21-4590_Sal$ snakemake -s abricate_AMR_VF_snakefile -j1
Building DAG of jobs...
MissingInputException in line 106 of /home/sean/Desktop/salmonella/LS21-4590_Sal/abricate_AMR_VF_snakefile:
Missing input files for rule SummaryAMR:
abricate/LS21-4590-1-Salmonella/LS21-4590-1-Salmonella_ncbi.tab
abricate/LS21-4590-1-Salmonella/LS21-4590-1-Salmonella_argannot.tab
abricate/LS21-4590-1-Salmonella/LS21-4590-1-Salmonella_resfinder.tab
abricate/LS21-4590-1-Salmonella/LS21-4590-1-Salmonella_bacmet2.tab
abricate/LS21-4590-1-Salmonella/LS21-4590-1-Salmonella_megares.tab
abricate/LS21-4590-1-Salmonella/LS21-4590-1-Salmonella_card.tab
The script is below, but I do not understand why Snakemake is not making the needed files. I know this is a user error, but I cannot see where I have gone astray. Any help is greatly appreciated.
configfile: "config.yaml"

rule all:
    input:
        expand("SummaryAMR_{sample}.tab", sample = config["names"]),
        expand("SummaryVF_{sample}.tab", sample = config["names"]),
        expand("plasmidfinder_{sample}.tab", sample = config["names"])

# Finding AMR Genes
rule bacmet2_db:
    input:
        "{sample}_de_novo/contigs.fasta"
    params:
        db="bacmet2"
    output:
        directory("abricate/{sample}/AMR_{sample}_{params.db}.tab")
    shell:
        "abricate --db {params.db} {input} > {output}"

rule card_db:
    input:
        "{sample}_de_novo/contigs.fasta"
    params:
        db="card"
    output:
        directory("abricate/{sample}/AMR_{sample}_{params.db}.tab")
    shell:
        "abricate --db {params.db} {input} > {output}"

rule megares_db:
    input:
        "{sample}_de_novo/contigs.fasta"
    params:
        db="megares"
    output:
        directory("abricate/{sample}/AMR_{sample}_{params.db}.tab")
    shell:
        "abricate --db {params.db} {input} > {output}"

rule ncbi_AMRFinderPlus:
    input:
        "{sample}_de_novo/contigs.fasta"
    params:
        db="ncbi"
    output:
        directory("abricate/{sample}/AMR_{sample}_{params.db}.tab")
    shell:
        "abricate --db {params.db} {input} > {output}"

rule resfinder_db:
    input:
        "{sample}_de_novo/contigs.fasta"
    params:
        db="resfinder"
    output:
        directory("abricate/{sample}/AMR_{sample}_{params.db}.tab")
    shell:
        "abricate --db {params.db} {input} > {output}"

rule argannot:
    input:
        "{sample}_de_novo/contigs.fasta"
    params:
        db="argannot"
    output:
        directory("abricate/{sample}/AMR_{sample}_{params.db}.tab")
    shell:
        "abricate --db {params.db} {input} > {output}"

# Finding virulence factors
rule vfdb:
    input:
        "{sample}_de_novo/contigs.fasta"
    params:
        db="vfdb"
    output:
        directory("abricate/{sample}/{sample}_{params.db}.tab")
    shell:
        "abricate --db {params.db} {input} > {output}"

rule victors:
    input:
        "{sample}_de_novo/contigs.fasta"
    params:
        db="victors"
    output:
        directory("abricate/{sample}/{sample}_{params.db}.tab")
    shell:
        "abricate --db {params.db} {input} > {output}"

# Finding Plasmids
rule plasmidfinder:
    input:
        "{sample}_de_novo/contigs.fasta"
    params:
        db="plasmidfinder"
    output:
        "{params.db}_{sample}.tab"
    shell:
        "abricate --db {params.db} {input} > {output}"

rule SummaryAMR:
    input:
        argannot="abricate/{sample}/{sample}_argannot.tab",
        bacmet2="abricate/{sample}/{sample}_bacmet2.tab",
        card="abricate/{sample}/{sample}_card.tab",
        megares="abricate/{sample}/{sample}_megares.tab",
        ncbi="abricate/{sample}/{sample}_ncbi.tab",
        resfinder="abricate/{sample}/{sample}_resfinder.tab",
    output:
        "SummaryAMR_{sample}.tab"
    shell:
        "abricate summary {input.argannot} {input.bacmet2} {input.card} {input.megares} \
        {input.ncbi} {input.resfinder} > {output}"

rule SummaryVF:
    input:
        vfdb="abricate/{sample}/{sample}_vfdb.tab",
        victors="abricate/{sample}/{sample}_victors.tab"
    output:
        "SummaryVF_{sample}.tab"
    shell:
        "abricate summary {input.vfdb} {input.victors} > {output}"
Please post the error.
Sorry about that, please find the error message in the main post above. For as helpful as it is.
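The file names in that error message can be compared directly against the output patterns of the per-database rules. Below is a minimal Python sketch of that comparison, not Snakemake's actual matching code: `output_regex` and `match_output` are hypothetical helpers, and the `alternative` pattern with a `db` wildcard is an assumption rather than anything in the original script. It relies on the fact that Snakemake fills output paths from wildcards only, so `{params.db}` in an output is not replaced by the value set under `params`.

```python
import re

def output_regex(pattern):
    """Simplified model of wildcard matching: each {name} becomes a regex group,
    and a repeated name must match the same text again."""
    pieces, seen = [], set()
    # re.split with one capturing group alternates literal text and wildcard names
    for i, part in enumerate(re.split(r"\{(\w+)\}", pattern)):
        if i % 2 == 0:
            pieces.append(re.escape(part))      # literal text ("{params.db}" stays literal)
        elif part in seen:
            pieces.append(f"(?P={part})")       # same wildcard again: must repeat its value
        else:
            seen.add(part)
            pieces.append(f"(?P<{part}>.+)")    # first use of a wildcard
    return re.compile("".join(pieces) + r"\Z")

def match_output(pattern, path):
    """Return the wildcard values if `pattern` could produce `path`, else None."""
    m = output_regex(pattern).match(path)
    return m.groupdict() if m else None

# One of the files listed in the MissingInputException
wanted = "abricate/LS21-4590-1-Salmonella/LS21-4590-1-Salmonella_card.tab"

# Output pattern as posted in the AMR rules
as_posted = "abricate/{sample}/AMR_{sample}_{params.db}.tab"
# Hypothetical alternative: no "AMR_" prefix, database as a wildcard named "db"
alternative = "abricate/{sample}/{sample}_{db}.tab"

print(match_output(as_posted, wanted))    # None
print(match_output(alternative, wanted))  # {'sample': 'LS21-4590-1-Salmonella', 'db': 'card'}
```

In this model the posted pattern cannot produce the requested file, both because of the `AMR_` prefix and because `{params.db}` never turns into `card`. If the database becomes a wildcard, a single rule whose shell command uses `{wildcards.db}` could probably stand in for the per-database rules. The `directory()` wrapper is meant for directory outputs, so it likely does not belong on a `.tab` file either.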
Try running snakemake in dry-run mode and see if it throws any error. Also check whether the previous steps generate the appropriate output. Try setting priorities if there is no syntax error. In SummaryAMR (resfinder="abricate/{sample}/{sample}_resfinder.tab",), I see an extra comma at the end; see if removing it works.
I ran the pipeline in dry-run mode (-n) both before and after deleting that extra comma and received the same error without any extra information. Also, it is not making the new directories when it is run in live mode. I will go and read the docs to find out how to set priorities; I didn't know that was a thing.
My understanding of Snakemake was that you target the final outputs you want and it works backwards to get there, so I'm not sure why it sees the missing files and just stops.
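That backwards search can be pictured with a toy planner. This is only a sketch under simplifying assumptions: `plan`, `MissingInput`, the sample name `S1`, and the rule tables are all hypothetical, rules are reduced to exact file names with no wildcards, and only the `card` database is shown. The point it illustrates is that working backwards only succeeds when some rule's declared output is exactly the file that a later rule asks for.

```python
class MissingInput(Exception):
    """Raised when a needed file is not on disk and no rule declares it as output."""

def plan(target, rules, on_disk, order=None):
    """Work backwards from `target` and return the files to build, in build order.

    rules   : dict mapping an output file name to the list of its input files
    on_disk : set of files that already exist
    """
    if order is None:
        order = []
    if target in on_disk or target in order:
        return order                       # nothing to do for this file
    if target not in rules:
        raise MissingInput(target)         # the search stops here
    for dep in rules[target]:
        plan(dep, rules, on_disk, order)   # resolve inputs first
    order.append(target)
    return order

on_disk = {"S1_de_novo/contigs.fasta"}
summary_inputs = ["abricate/S1/S1_card.tab"]

# Output named differently from what the summary rule requests
rules_as_posted = {
    "SummaryAMR_S1.tab": summary_inputs,
    "abricate/S1/AMR_S1_{params.db}.tab": ["S1_de_novo/contigs.fasta"],
}
# Output named exactly as the summary rule requests it
rules_matching = {
    "SummaryAMR_S1.tab": summary_inputs,
    "abricate/S1/S1_card.tab": ["S1_de_novo/contigs.fasta"],
}

try:
    plan("SummaryAMR_S1.tab", rules_as_posted, on_disk)
except MissingInput as err:
    print("Missing input, no rule produces:", err)

print(plan("SummaryAMR_S1.tab", rules_matching, on_disk))
```

With the first table the planner gives up on `abricate/S1/S1_card.tab`, which mirrors the MissingInputException above. With the second it returns the abricate output followed by the summary file.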
FYI Error from Dry runs