11 months ago
ema
Hello!
I'm currently writing a Snakemake pipeline for exome data. For joint variant calling I need to use GATK's GenomicsDBImport, but I'm unsure how to pass all the samples to it at once. Here's a simplified version of the rule I'm using:
rule GenomicsDBImport:
    input:
        gvcf = expand("variant_call/{sample}_raw_variants.g.vcf", sample=SAMPLE),
        ref = REF
    output:
        dir = "GDBI_database"
    shell:
        """
        ({GATK} GenomicsDBImport -R {input.ref} -V {input.gvcf} --genomicsdb-workspace-path {output.dir}) 2> {log}
        """
From my understanding, the expand function gives me a list of all the sample names as strings. My question is: can the '-V' argument take a list as input? There's also the option to use a snakemake wrapper, but I'm unfamiliar with that method.
Thanks in advance!
You'd need to take the map-file route, I think. The wrapper does a better job using this line:
You could add that code and see whether it works:
The above code is just theoretical, completely untested.
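For illustration, here is a minimal, equally untested sketch of the two routes mentioned in this thread: repeating `-V` once per gVCF, or writing a sample map. The helper names `variant_args` and `sample_map_text` are hypothetical (they are not part of GATK, Snakemake, or the wrapper), and the sample names are made up. The path pattern is copied from the rule in the question.

```python
# Hypothetical helpers for building GenomicsDBImport arguments.
# Nothing here calls GATK; it only builds strings.

PATTERN = "variant_call/{sample}_raw_variants.g.vcf"  # pattern from the question


def variant_args(gvcfs, flag="-V"):
    """Repeat the flag before every gVCF path: '-V a.g.vcf -V b.g.vcf'."""
    return " ".join(f"{flag} {path}" for path in gvcfs)


def sample_map_text(samples, pattern=PATTERN):
    """Build a tab-delimited sample map: one 'sample<TAB>path' line per sample."""
    return "".join(f"{s}\t{pattern.format(sample=s)}\n" for s in samples)


if __name__ == "__main__":
    SAMPLE = ["s1", "s2"]  # made-up sample names
    gvcfs = [PATTERN.format(sample=s) for s in SAMPLE]
    print(variant_args(gvcfs))
    print(sample_map_text(SAMPLE), end="")
```

In the rule, the first helper could be hooked in through something like `params: gvcfs=lambda wildcards, input: variant_args(input.gvcf)` and used as `{params.gvcfs}` in the shell command in place of `-V {input.gvcf}`. For the map-file route, the text from `sample_map_text` would be written to a file and passed with `--sample-name-map` instead of `-V`. Note that the rule as posted also uses `{log}` without a `log:` directive, and that GenomicsDBImport expects at least one interval via `-L`, so both would likely need adding as well.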