Question

Snakemake rule error

0

Entering edit mode

16 months ago

tjroger86 ▴ 10

I have the following rule in snakemake:

rule low_coverage_contig_reads:
    input:
        bam="data/processed/bam_files/bam/{sample}_{fraction}.bam.bai",
    output:
        r1="data/processed/clean_reads/low_cov/low_cov_{sample}_{fraction}_R1.fq.gz",
        r2="data/processed/clean_reads/low_cov/low_cov_{sample}_{fraction}_R2.fq.gz"
    threads: 8
    params:
        bam="data/processed/bam_files/bam/{sample}_{fraction}.bam"
    log:
        log1="logs/{sample}_{fraction}_low_coverage_reads.log",
    shell:
        """
            (samtools coverage {params.bam} | awk 'NR > 1 && $7 < 10 {{print $1}}' | tr '\\n' ' ' | samtools view -u {params.bam} -b | samtools fastq -@ {threads} -1 {output.r1} -2 {output.r2})2> {log.log1}
        """

The intention behind this rule is to create forward and reverse fq.gz files of the reads that mapped to low coverage contigs (< 10x). However, everytime I run this I get the following error:

Error in rule low_coverage_contig_reads:
    jobid: 24
    output: data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R1.fq.gz, data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R2.fq.gz, 
    log: logs/day7-DO-0-12C-viral_8_low_coverage_reads.log (check log file(s) for error message)
    shell:

            (samtools coverage data/processed/bam_files/bam/day7-DO-0-12C-viral_8.bam | awk 'NR > 1 && $7 < 10 {print $1}' | tr '\n' ' ' | samtools view -u data/processed/bam_files/bam/day7-DO-0-12C-viral_8.bam -b | samtools fastq -@ 8 -1 data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R1.fq.gz -2 data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R2.fq.gz

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job low_coverage_contig_reads since they might be corrupted:
data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R1.fq.gz, data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R2.fq.gz

I am not sure what is wrong with this rule. Any advice?

snakemake python • 2.0k views

ADD COMMENT • link updated 16 months ago by dariober 15k • written 16 months ago by tjroger86 ▴ 10

2

Entering edit mode

You probably should log all steps in the pipe:

instead of
commandA -l 1 --input foo | commandB -a 3 | commandC -o yiiehaaa.vcf 2> {log}

do the more tedious but far more informative
commandA -l 1 --input foo 2> {log} | commandB -a 3 2>> {log} | commandC -o yiiehaaa.vcf 2>> {log}

ADD REPLY • link 16 months ago by WouterDeCoster 47k

0

Entering edit mode

That's a good suggestion (+1), but it won't capture the non-zero exit code in something like echo 'foo' | grep bar | sort and I think stderr should be printed in the snakemake logs anyway. It's unfortunate that snakemake doesn't give a more informative message than one of the commands exited with non-zero exit code, I've seen people (including myself) being puzzled by this.

ADD REPLY • link 16 months ago by dariober 15k

0

Entering edit mode

What does the log file say?

ADD REPLY • link 16 months ago by Ram 45k

0

Entering edit mode

The only thing the log says is:

[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 6788851 reads

ADD REPLY • link 16 months ago by tjroger86 ▴ 10

0

Entering edit mode

Also, I should add that when I run the followning:

samtools coverage data/processed/bam_files/bam/day7-DO-0-12C-viral_8.bam | awk 'NR > 1 && $7 < 10 {print $1}' | tr '\n' ' ' | samtools view -u data/processed/bam_files/bam/day7-DO-0-12C-viral_8.bam -b | samtools fastq -@ 8 -1 data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R1.fq.gz -2 data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R2.fq.gz

in terminal outside of the snakemake pipeline, it works just fine

ADD REPLY • link 16 months ago by tjroger86 ▴ 10

0

Entering edit mode

The (...) 2> is not part of your non-snakemake run so that might be the problem.

ADD REPLY • link 16 months ago by Ram 45k

0

Entering edit mode

I tried that aswell with no luck

ADD REPLY • link 16 months ago by tjroger86 ▴ 10

0

Entering edit mode

Can you try the command immediately followed by echo $? and see what that prints?

ADD REPLY • link 16 months ago by Ram 45k

0

Entering edit mode

where would I put "echo $?" in the shell part of my rule?

ADD REPLY • link 16 months ago by tjroger86 ▴ 10

0

Entering edit mode

Not in the rule, just as a command right after the samtools coverage command. We are testing the command now, not the snakemake rule.

ADD REPLY • link 16 months ago by Ram 45k

0

Entering edit mode

Ok. I will try that. But, as an update, I used this error function that I found here, https://stackoverflow.com/questions/44616073/thread-py-error-snakemake/44625951#44625951:

shell.prefix("""
# http://linuxcommand.org/wss0150.php
PROGNAME=$(basename $0)

function error_exit
{{
#   ----------------------------------------------------------------
#   Function for exit due to fatal program error
#       Accepts 1 argument:
#           string containing descriptive error message
#   ----------------------------------------------------------------
    echo "${{PROGNAME}}: ${{1:-"Unknown Error"}}" 1>&2
    exit 1
}}
""")

and then added it to my rule as follows:

rule low_coverage_contig_reads:
    input:
        bam="data/processed/bam_files/bam/{sample}_{fraction}.bam.bai",
    output:
        r1="data/processed/clean_reads/low_cov/low_cov_{sample}_{fraction}_R1.fq.gz",
        r2="data/processed/clean_reads/low_cov/low_cov_{sample}_{fraction}_R2.fq.gz"

    threads: 8
    params:
        bam="data/processed/bam_files/bam/{sample}_{fraction}.bam"
    log:
        log1="logs/{sample}_{fraction}_low_coverage_reads.log"
    shell:
        """
            (samtools coverage {params.bam} | awk 'NR > 1 && $7 < 10 {{print $1}}' | tr '\\n' ' ' | samtools view -u {params.bam} -b | samtools fastq -@ {threads} -1 {output.r1} -2 {output.r2} || error_exit "samtools failed")2> {log.log1}
        """

and some how that stopped it from throwing the non-zero exit and it ran without any errors, producing the forward and reverse files that I expected... Not sure why that solved the problem

ADD REPLY • link 16 months ago by tjroger86 ▴ 10

score 2 · Answer 1 · 2023-12-12

To find out where the problem occurs in the chain of commands you could execute the pipes on your terminal followed by echo "${PIPESTATUS[@]}". In contrast to echo $?, you get the exit code for all the commands in the pipe, not for just the last one. For example:

echo '1 2 3' | foo | sort 
echo "${PIPESTATUS[@]}"

Gives you:

0 127 0

while echo $? would give you 0.

For the record: I submitted this issue to the snakemake repository: https://github.com/snakemake/snakemake/issues/2544