Hi, I am building a Snakemake pipeline and am having trouble adding a wildcard for the filter type. My first rule filters variants and writes two output VCF files, one per filter (qfilt or qfiltreg). I would like the second rule to treat {filter} as a wildcard and produce a FASTA file for each of these filtered VCFs in parallel. I am unsure how to add this wildcard, because Filter Variants runs only once but produces both VCFs. Thank you in advance! Best,
# Filter variants.
rule filter_vars:
    input:
        ref_path='{ref}.fa',
        vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_deep.g.vcf.gz',
    log:
        'results/{samp}/logs/{samp}_{mapper}_{ref}_filt_vcf_stats.txt'
    output:
        filt_vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_qfilt.vcf.gz',
        noppe_vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_qfiltreg.vcf.gz'
    shell:
        "scripts/filter_vars.sh {input.ref_path} {input.vcf} {output.filt_vcf} > {log}"

# Convert single_sample VCF to fasta.
rule vcf_to_fasta:
    input:
        ref_path='{ref}.fa',
        filt_vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_{filter}.vcf.gz'
    output:
        fasta='results/{samp}/fasta/{samp}_{mapper}_{ref}_{filter}.fa'
    shell:
        "scripts/vcf2fasta.sh {input.ref_path} {input.filt_vcf} {output.fasta}"
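For anyone reading along: Snakemake works backwards from the files that are requested, so {filter} only takes a value once a target FASTA naming that filter is asked for. Below is a minimal plain-Python sketch of the target list that would do this. The sample, mapper, and reference names are hypothetical placeholders, since the post does not list the real ones.

```python
from itertools import product

# Hypothetical values; the post does not list the real samples, mappers, or references.
SAMPLES = ["sampleA"]
MAPPERS = ["bwa"]
REFS = ["ref1"]
FILTERS = ["qfilt", "qfiltreg"]  # the two filters named in filter_vars

# Same path pattern as the vcf_to_fasta output.
FASTA_PATTERN = "results/{samp}/fasta/{samp}_{mapper}_{ref}_{filter}.fa"


def fasta_targets():
    """Return one FASTA path per combination of wildcard values."""
    return [
        FASTA_PATTERN.format(samp=s, mapper=m, ref=r, filter=f)
        for s, m, r, f in product(SAMPLES, MAPPERS, REFS, FILTERS)
    ]


targets = fasta_targets()
for path in targets:
    print(path)
```

In a Snakefile, the same list is usually produced with `expand(...)` in the input of a top-level `rule all`. Requesting both FASTA files makes vcf_to_fasta run once per filter, while filter_vars still runs once per sample because both VCFs are outputs of the same job.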
Well, I'm not crazy about underscores appearing both inside the sample names and as delimiters. Try a delimiter that won't confuse the wildcard regex. Also, just update your original question; you don't need to submit an answer.
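To illustrate the underscore point: an unconstrained Snakemake wildcard behaves like the regex `.+`, so a value containing `_` can be split at the wrong place. The sketch below uses hand-written regexes that approximate what Snakemake builds for the filt_vcf output pattern. The file name and the reference name `GRCh38_p13` are made up for the example.

```python
import re

# Hypothetical file name: the reference name "GRCh38_p13" contains an underscore.
path = "results/s1/vars/s1_bwa_GRCh38_p13_qfilt.vcf.gz"

# Approximation of the regex for the filt_vcf pattern with unconstrained wildcards (".+").
loose = re.compile(
    r"results/(?P<samp>.+)/vars/(?P=samp)_(?P<mapper>.+)_(?P<ref>.+)_qfilt\.vcf\.gz$"
)

# Same pattern, but the mapper wildcard may not contain "_".
strict = re.compile(
    r"results/(?P<samp>.+)/vars/(?P=samp)_(?P<mapper>[^_]+)_(?P<ref>.+)_qfilt\.vcf\.gz$"
)

loose_match = loose.match(path).groupdict()
strict_match = strict.match(path).groupdict()

print(loose_match)   # mapper swallows part of the reference name
print(strict_match)  # mapper stops at the first underscore
```

With the loose pattern the greedy match assigns `bwa_GRCh38` to mapper and `p13` to ref. Forbidding underscores in mapper recovers `bwa` and `GRCh38_p13`. A delimiter that never occurs inside the values avoids the problem altogether.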
Okay, thank you for the feedback about the sample names, good point! Regarding the original question: the problem was an incorrect wildcard constraint on the filter wildcard. Issue solved - thank you again!
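The exact constraint that fixed it is not shown here, so the following is only a sketch of what a constraint on {filter} does, assuming it is restricted to the two filter names from the question. The helper `pattern_to_regex` is a hypothetical, simplified stand-in for Snakemake's internal pattern-to-regex conversion, not its actual API.

```python
import re


def pattern_to_regex(pattern, constraints=None):
    """Turn a '{name}' path pattern into a regex, roughly as Snakemake does.

    Unconstrained wildcards become '.+'; a repeated wildcard becomes a
    backreference so that it must match the same text each time.
    """
    constraints = constraints or {}
    seen = set()
    parts = []
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        parts.append(re.escape(pattern[pos:m.start()]))
        name = m.group(1)
        if name in seen:
            parts.append(f"(?P={name})")
        else:
            seen.add(name)
            parts.append(f"(?P<{name}>{constraints.get(name, '.+')})")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts) + "$")


VCF_PATTERN = "results/{samp}/vars/{samp}_{mapper}_{ref}_{filter}.vcf.gz"

# Hypothetical file names following the naming scheme in the question.
gvcf = "results/s1/vars/s1_bwa_hg38_deep.g.vcf.gz"
filtered = "results/s1/vars/s1_bwa_hg38_qfiltreg.vcf.gz"

unconstrained = pattern_to_regex(VCF_PATTERN)
constrained = pattern_to_regex(VCF_PATTERN, {"filter": "qfilt|qfiltreg"})

# Without a constraint, the raw gVCF also fits the pattern (filter == "deep.g").
print(unconstrained.match(gvcf).group("filter"))
# With the constraint, only the two real filter names are accepted.
print(constrained.match(gvcf))
print(constrained.match(filtered).group("filter"))
```

In a Snakefile, the equivalent restriction would be written as `wildcard_constraints: filter="qfilt|qfiltreg"`, either globally or inside the vcf_to_fasta rule.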