Snakemake FastQC MissingOutputFiles Error
5.2 years ago
lasejourny ▴ 10

Hello,

I'm trying to put together a small two-step workflow with Snakemake for 3 FASTQ files, but I keep failing at the FastQC step. When I run the rules separately, they produce the results I'm looking for. When I run them together, however, I get a MissingOutputException.

This is my Snakefile:

SAMPLES = ["A", "B", "C"]

rule all:
    input:
        expand("reports/{sample}_fastqc.zip", sample=SAMPLES),
        expand("reports/{sample}_fastqc.html", sample=SAMPLES),
        expand("trimmed_reads/{sample}.trimmed.fastq", sample=SAMPLES),
        expand("trimmed_reads/{sample}.trimmed.fastq", sample=SAMPLES)

rule trimming:
    input:
        "data/{sample}.fastq"
    output:
        "trimmed_reads/{sample}.trimmed.fastq"
    shell:
        "cutadapt -u 15 -q 15 -o {output} {input}"

rule fastqc:
    input:
        "trimmed_reads/{sample}.trimmed.fastq"
    output:
        "reports/{sample}_fastqc.zip",
        "reports/{sample}_fastqc.html"
    shell:
        "fastqc {input} -q -o reports/"

This is the error message:

[Thu Sep 12 17:49:22 2019]

rule clean_fastqc:

input: data/C.fastq
    output: reports/C_fastqc.zip, cleaned_reports/C_fastqc.html

jobid: 7
    wildcards: sample=C

Waiting at most 5 seconds for missing files.

MissingOutputException in line 29 of /Users//workSpace/snakemake-tutorial/fastQC/Snakefile:
Missing files after 5 seconds:

reports/C_fastqc.zip

reports/C_fastqc.html

This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /Users//workSpace/snakemake-tutorial/fastQC/.snakemake/log/2019-09-12T174910.065589.snakemake.log

I tried running this with the --latency-wait option but got the same result. Does anyone have any advice as to why this is occurring? The trimming step seems to run fine, but when the fastqc step runs, that's when things fall apart.

Can you check while FastQC is running whether or not the expected output files are being produced? If so, the problem is most likely that FastQC is giving the output files names (or paths) that differ from what Snakemake is expecting.

Your input is "trimmed_reads/{sample}.trimmed.fastq". Shouldn't the output then be something like reports/C.trimmed.html? Try running fastqc from the terminal, outside Snakemake, to check the output file names.

5.2 years ago
bari.ballew ▴ 470

FastQC names its output files by taking the input filename, removing the .fastq extension, and appending _fastqc.html or _fastqc.zip. Since your input is {sample}.trimmed.fastq, your output should be reports/{sample}.trimmed_fastqc.html and reports/{sample}.trimmed_fastqc.zip, not reports/{sample}_fastqc.html and reports/{sample}_fastqc.zip.
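
To make the naming rule concrete, here is a minimal Python sketch that predicts the report paths for a given input. The function name expected_fastqc_outputs and the list of stripped suffixes are my own assumptions for illustration, not part of FastQC or Snakemake, so check the result against what your FastQC version actually writes.

```python
from pathlib import PurePosixPath

# Suffixes removed before "_fastqc" is appended. This list is an assumption
# for illustration; verify it against the FastQC version you are running.
STRIPPED_SUFFIXES = (".gz", ".fastq", ".fq")


def expected_fastqc_outputs(input_path, outdir="reports"):
    """Return the (zip, html) paths FastQC is expected to write for input_path."""
    # FastQC ignores the input directory and keeps only the file name.
    name = PurePosixPath(input_path).name
    # Strip each known suffix in order, so "x.fastq.gz" becomes "x".
    for suffix in STRIPPED_SUFFIXES:
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    base = PurePosixPath(outdir) / f"{name}_fastqc"
    return f"{base}.zip", f"{base}.html"


print(expected_fastqc_outputs("trimmed_reads/C.trimmed.fastq"))
# ('reports/C.trimmed_fastqc.zip', 'reports/C.trimmed_fastqc.html')
```

The ".trimmed" part of the name survives, which is why the output section of the fastqc rule has to declare reports/{sample}.trimmed_fastqc.zip and reports/{sample}.trimmed_fastqc.html for Snakemake to find the files.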

Also, you only need to include the final output of your pipeline in the rule all, so yours would be:

rule all:
    input:
        expand("reports/{sample}.trimmed_fastqc.zip", sample=SAMPLES),
        expand("reports/{sample}.trimmed_fastqc.html", sample=SAMPLES)