Snakemake temp()
0
0
Entering edit mode
2.9 years ago
kstangline ▴ 80

I have Snakemake hooked up to an S3 account, and I'm wanting to delete certain temp() files after processing our pipeline.

I have a rule that designates certain files as temp(). Below is an example of one:

#Split rep element mapped bam file into subfiles rule split_rep_bam:   input:
    'rep_element_pipeline/{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam'   output:
    temp('rep_element_pipeline/AA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/AC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/AG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/AN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/AT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/CA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/CC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/CG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/CN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/CT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/GA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/GC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/GG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/GN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/GT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/NA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/NC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/NG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/NN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/NT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/TA.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/TC.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/TG.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/TN.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp'),
    temp('rep_element_pipeline/TT.{sample}.fastq.gz.mapped_vs_' + config["ref"]["bt2_index"] + '.sam.tmp')   conda:
    '../envs/rep_element.yaml'   params:
    fp=full_path   shell:
    'perl ../scripts/split_bam_to_subfiles.pl {input[0]} '
    '{params.fp}/rep_element_pipeline/'

I've been running Snakemake as such:

snakemake --default-remote-provider S3 --default-remote-prefix '$s3' --use-conda --cores 32 --rerun-incomplete --printshellcmds --delete-temp-output

Where I've specified the --delete-temp-output option.

When running this rule, it appears snakemake is deleting these files. However, these files still persist in S3. Does anyone know why these files aren't being deleted in S3? The temp() files appear to be deleted locally on my machine, but are uploaded to S3.

Building DAG of jobs...
Deleting 20211222-rp030/rep_element_pipeline/AA.APP-01_rep_1_Ribo_R1.fastq.gz.mapped_vs_bt2_hg38.sam.tmp
Deleting 20211222-rp030/rep_element_pipeline/AC.APP-01_rep_1_Ribo_R1.fastq.gz.mapped_vs_bt2_hg38.sam.tmp

I would really just like for my temp() files not to be uploaded to S3, as they take up a lot of disk space.

snakemake • 1.5k views
ADD COMMENT
0
Entering edit mode
ADD REPLY

Login before adding your answer.

Traffic: 2797 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6