3.8 years ago
kamanovae
I am using Snakemake to describe a GATK pipeline. Right now my code looks like this:
SAMPLES = glob_wildcards("input_bam/{sample}.bam")

rule all:
    input:
        expand("out/{sample}.AddOrReplaceReadGroups.bam", sample=SAMPLES.sample),
        expand("out/{sample}.MarkDuplicates.txt", sample=SAMPLES.sample),
        expand("out/{sample}.MarkDuplicates.bam", sample=SAMPLES.sample),
        expand("out/{sample}.BaseRecalibrator.txt", sample=SAMPLES.sample)

rule gatk_AddOrReplaceReadGroups:
    input:
        "input_bam/{sample}.bam"
    output:
        "out/{sample}.AddOrReplaceReadGroups.bam"
    shell:
        "./gatk-4.1.9.0/gatk --java-options '-Xmx30g' AddOrReplaceReadGroups -I {input} -O {output} -ID group1 -SM NORMAL -PL illumina -LB lib1 -PU unit1"

rule gatk_MarkDuplicates:
    input:
        rules.gatk_AddOrReplaceReadGroups.output
    output:
        output1="out/{sample}.MarkDuplicates.bam",
        output2="out/{sample}.MarkDuplicates.txt"
    shell:
        "./gatk-4.1.9.0/gatk --java-options '-Xmx4g' MarkDuplicates -I {input} -O {output.output1} -M {output.output2} --CREATE_INDEX true"

rule gatk_BaseRecalibrator:
    input:
        input1="intervals/0000-scattered.bed",
        input2=rules.gatk_MarkDuplicates.output.output1,
        input3="ref/ref.fa",
        input4="dbsnp/dbsnp_150.hg38.vcf.gz"
    output:
        output1="out/{sample}.BaseRecalibrator.txt"
    shell:
        "./gatk --java-options '-Xmx7680m' BaseRecalibrator -L {input.input1} -I {input.input2} -O {output.output1} -R {input.input3} --known-sites {input.input4}"
Instead of the single file given as input1 of gatk_BaseRecalibrator (intervals/0000-scattered.bed), I want to use all the files that I have in the intervals folder as input. How can I do this correctly?
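One direction that might work, as a rough sketch rather than a tested solution: collect every interval file in the folder with plain Python and turn the list into repeated `-L` arguments, which GATK accepts more than once on a command line. The helper names `list_intervals` and `interval_args`, the `INTERVALS` variable, and the `*.bed` pattern are hypothetical and not part of the pipeline above.

```python
from pathlib import Path


def list_intervals(folder="intervals", pattern="*.bed"):
    """Return every interval file in `folder`, sorted for a stable order.

    The folder name and the *.bed pattern are assumptions based on the
    single file used in the rule (intervals/0000-scattered.bed).
    """
    return sorted(str(p) for p in Path(folder).glob(pattern))


def interval_args(paths):
    """Build repeated `-L <file>` arguments, one per interval file."""
    return " ".join(f"-L {p}" for p in paths)


# Hypothetical use inside the Snakefile (not run here):
#
# INTERVALS = list_intervals()
#
# rule gatk_BaseRecalibrator:
#     input:
#         input1=INTERVALS,   # a list: every file in intervals/
#         ...                 # other inputs unchanged
#     params:
#         intervals=lambda wildcards, input: interval_args(input.input1)
#     shell:
#         "./gatk ... BaseRecalibrator {params.intervals} -I {input.input2} ..."
```

This runs one BaseRecalibrator job per sample over all intervals at once. If the goal is instead one job per interval file, the output name would probably need an extra wildcard for the interval, expanded in rule all alongside `sample`.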