snakemake
2.6 years ago

Hi

I'm having an issue with my Snakemake file: it runs, but it fails with a MissingOutputException.

SRA,FRR = glob_wildcards("rawReads/{sra}_{frr}.fastq.gz")

rule all:

    input:
        expand("rawQC/{sra}_{frr}_fastqc.{extension}", sra=SRA, frr=FRR,extension=["gz","html"]),
        expand("trimmedreads{sra}_fastq.html", sra=SRA),

rule rawFastqc:

    input:
        rawread="rawReads/{sra}_{frr}.fastq.gz",
    output:
        gz="rawQC/{sra}_{frr}_fastqc.gz",
        html="rawQC/{sra}_{frr}_fastqc.html",
    threads:
        1
    params:
        path="rawQC/",
    shell:
        """
        fastqc {input.rawread} --threads {threads} -o {params.path}
        """

rule fastp:

     input:
         read1="rawReads/{sra}_1.fastq.gz",
         read2="rawReads/{sra}_2.fastq.gz",
     output:
         read1="trimmedreads/{sra}_1P.fastq.gz",
         read2="trimmedreads/{sra}_2P.fastq.gz",
         report_html= "trimmedreads{sra}_fastq.html",
     threads: 
        4
     shell:
         """
         fastp --thread {threads} -i {input.read1} -I {input.read2} -o {output.read1} -O {output.read2} -h {output.report_html}
         """

And this is the error:

MissingOutputException in line 10 of /mnt/d/snakemake/snakefile.py:
Job Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
rawQC/SRR8571278_2_fastqc.gz completed successfully, but some output files are missing. 2
Removing output files of failed job rawFastqc since they might be corrupted:

I'm trying to fix the error, but I can't tell what is missing in the code.


It sounds like the files are not found: fastqc runs with invalid input, errors out in some way, and evidently the output files are missing as well.

Personal observation: in general, Snakemake is overkill for examples like this. I would suggest learning something simpler first: bash scripting, GNU parallel, simple Makefiles, etc.


This worked for me. Code is copy/pasted from your code:

directory:

$ tree rawReads

 rawReads
 ├── sra_frr.fastq.gz
 └── test.sm

0 directories, 2 files

snakemake file:

$ cat rawReads/test.sm

SRA,FRR = glob_wildcards("rawREADS/{sra}_{frr}.fastq.gz")
print (SRA)
print(FRR)

rule all:
    input:
            expand("rawQC/{sra}_{frr}_fastqc.{extension}", sra=SRA, frr=FRR,extension=["gz","html"]),

rule rawFastQC:
    input:
         rawread="rawREADS/{sra}_{frr}.fastq.gz"
    output:
            gz="rawQC/{sra}_{frr}_fastqc.gz",
            html="rawQC/{sra}_{frr}_fastqc.html"
    threads:
            1
    params:
            path="rawQC/"
    shell:
            """
            fastqc {input.rawread} --threads {threads} -o {params.path}
            """

code run:

$ snakemake -ns rawReads/test.sm -j 1

['sra']
['frr']
Building DAG of jobs...
Job stats:
job          count    min threads    max threads
---------  -------  -------------  -------------
all              1              1              1
rawFastQC        1              1              1
total            2              1              1


[Mon May  2 12:02:42 2022]
rule rawFastQC:
    input: rawREADS/sra_frr.fastq.gz
    output: rawQC/sra_frr_fastqc.gz, rawQC/sra_frr_fastqc.html
    jobid: 1
    wildcards: sra=sra, frr=frr
    resources: tmpdir=/var/folders/w6/z_3lbbdx0j7_s1wx3bt31n6c0000gn/T


[Mon May  2 12:02:42 2022]
localrule all:
    input: rawQC/sra_frr_fastqc.gz, rawQC/sra_frr_fastqc.html
    jobid: 0
    resources: tmpdir=/var/folders/w6/z_3lbbdx0j7_s1wx3bt31n6c0000gn/T

Job stats:
job          count    min threads    max threads
---------  -------  -------------  -------------
all              1              1              1
rawFastQC        1              1              1
total            2              1              1

This was a dry-run (flag -n). The order of jobs does not reflect the order of execution

Hi, thanks for the suggestion. The dry run works, but whenever I run it for real, the whole analysis completes and then the same error occurs again at the end.
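A dry run (-n) only plans jobs from the declared file names and never executes fastqc, so it cannot reveal that a declared output is never written. One way to see the mismatch after a real run is to compare the declared paths against what is on disk. This is a minimal sketch in plain Python; `missing_outputs` is a hypothetical helper, not part of Snakemake, and the file names in the comment are taken from the error message above.

```python
from pathlib import Path


def missing_outputs(declared, root="."):
    """Return the declared output paths that do not exist under root.

    declared: iterable of relative paths, as written in a rule's output section.
    root: working directory the workflow was run from (assumed, defaults to cwd).
    """
    root = Path(root)
    return [path for path in declared if not (root / path).exists()]


# Example call for one sample from the error message:
# missing_outputs([
#     "rawQC/SRR8571278_2_fastqc.gz",
#     "rawQC/SRR8571278_2_fastqc.html",
# ])
```

Any path this returns is one that the rule promised but the shell command did not create, which is the condition Snakemake reports as MissingOutputException.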

2.6 years ago

There is a fundamental mistake: FastQC doesn't output gz. It outputs zip and html, so the declared .gz file is never created. That is why the job fails.

Try this script:

# Collect sample IDs and read numbers from the input file names.
SRA,FRR = glob_wildcards("rawReads/{sra}_{frr}.fastq.gz")
print(SRA)
print(FRR)

rule all:
    input:
        expand("rawQC/{sra}_{frr}_fastqc.{extension}", sra=SRA, frr=FRR, extension=["zip","html"]),

rule rawFastQC:
    input:
        rawread="rawReads/{sra}_{frr}.fastq.gz"
    output:
        # FastQC writes <name>_fastqc.zip and <name>_fastqc.html
        zip="rawQC/{sra}_{frr}_fastqc.zip",
        html="rawQC/{sra}_{frr}_fastqc.html"
    threads:
        1
    params:
        path="rawQC/"
    shell:
        """
        fastqc {input.rawread} --threads {threads} -o {params.path}
        """

Beware of copy/pasting tabs: keep the indentation consistent (spaces only).
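To double-check which names a rule should declare, the FastQC naming convention can be sketched in plain Python. This is only an illustration under stated assumptions: `fastqc_output_names` is a hypothetical helper, it handles only the .fastq/.fq suffixes (optionally gzipped), and it assumes the default behaviour where FastQC leaves a zip archive and an html report in the output directory.

```python
from pathlib import Path


def fastqc_output_names(read_path, outdir="rawQC"):
    """Return the zip and html report paths expected for one read file.

    Assumption: FastQC drops the .fastq.gz / .fq.gz / .fastq / .fq suffix
    and appends _fastqc.zip and _fastqc.html to what remains.
    """
    name = Path(read_path).name
    for suffix in (".fastq.gz", ".fq.gz", ".fastq", ".fq"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    out = Path(outdir)
    return {
        "zip": (out / f"{name}_fastqc.zip").as_posix(),
        "html": (out / f"{name}_fastqc.html").as_posix(),
    }


# Example call:
# fastqc_output_names("rawReads/SRR8571278_2.fastq.gz")
```

The returned names are the ones the output section of the rule has to match; the compression of the input file does not carry over to the report files.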


Thanks for helping me understand; it works now. I assumed that because my input is in .gz format, the output would be too.

A really big thank you, you cleared up my understanding of how Snakemake works.
