I have the following rule in snakemake:
rule low_coverage_contig_reads:
    input:
        bam="data/processed/bam_files/bam/{sample}_{fraction}.bam.bai",
    output:
        r1="data/processed/clean_reads/low_cov/low_cov_{sample}_{fraction}_R1.fq.gz",
        r2="data/processed/clean_reads/low_cov/low_cov_{sample}_{fraction}_R2.fq.gz"
    threads: 8
    params:
        bam="data/processed/bam_files/bam/{sample}_{fraction}.bam"
    log:
        log1="logs/{sample}_{fraction}_low_coverage_reads.log",
    shell:
        """
        (samtools coverage {params.bam} | awk 'NR > 1 && $7 < 10 {{print $1}}' | tr '\\n' ' ' | samtools view -u {params.bam} -b | samtools fastq -@ {threads} -1 {output.r1} -2 {output.r2})2> {log.log1}
        """
The intention behind this rule is to create forward and reverse fq.gz files of the reads that mapped to low-coverage contigs (< 10x). However, every time I run this I get the following error:
Error in rule low_coverage_contig_reads:
jobid: 24
output: data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R1.fq.gz, data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R2.fq.gz,
log: logs/day7-DO-0-12C-viral_8_low_coverage_reads.log (check log file(s) for error message)
shell:
(samtools coverage data/processed/bam_files/bam/day7-DO-0-12C-viral_8.bam | awk 'NR > 1 && $7 < 10 {print $1}' | tr '\n' ' ' | samtools view -u data/processed/bam_files/bam/day7-DO-0-12C-viral_8.bam -b | samtools fastq -@ 8 -1 data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R1.fq.gz -2 data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R2.fq.gz
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job low_coverage_contig_reads since they might be corrupted:
data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R1.fq.gz, data/processed/clean_reads/low_cov/low_cov_day7-DO-0-12C-viral_8_R2.fq.gz
I am not sure what is wrong with this rule. Any advice?
You probably should log all steps in the pipe:
instead of
commandA -l 1 --input foo | commandB -a 3 | commandC -o yiiehaaa.vcf 2> {log}
do the more tedious but far more informative
commandA -l 1 --input foo 2> {log} | commandB -a 3 2>> {log} | commandC -o yiiehaaa.vcf 2>> {log}
That's a good suggestion (+1), but it won't capture the non-zero exit code in something like
echo 'foo' | grep bar | sort
and I think stderr should be printed in the snakemake logs anyway. It's unfortunate that snakemake doesn't give a more informative message than "one of the commands exited with non-zero exit code"; I've seen people (including myself) being puzzled by this.
What does the log file say?
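For reference, the echo 'foo' | grep bar | sort case mentioned above can be reproduced outside snakemake. This is only a minimal sketch, assuming bash; it toggles pipefail (one part of the strict mode that snakemake's error message refers to) to show how a pipeline can fail while writing nothing to stderr. The variable names status_default and status_pipefail are made up for the illustration.

```shell
#!/usr/bin/env bash
# Without pipefail, a pipeline's exit status is that of its last command (sort).
set +o pipefail
echo 'foo' | grep bar | sort
status_default=$?

# With pipefail, the pipeline reports the last non-zero status instead,
# here grep's exit code 1, which only means "no lines matched".
set -o pipefail
echo 'foo' | grep bar | sort
status_pipefail=$?

echo "default: $status_default, pipefail: $status_pipefail"
# prints "default: 0, pipefail: 1"
```

Since grep prints nothing when it finds no match, redirecting stderr to a log would leave the log empty in this situation even though the job is reported as failed.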
The only thing the log says is:
Also, I should add that when I run the following:
in terminal outside of the snakemake pipeline, it works just fine
The
(...) 2>
is not part of your non-snakemake run, so that might be the problem.
I tried that as well with no luck.
Can you try the command immediately followed by
echo $?
and see what that prints?
Where would I put "echo $?" in the shell part of my rule?
Not in the rule, just as a command right after the
samtools coverage
command. We are testing the command now, not the snakemake rule.
Ok. I will try that. But, as an update, I used this error function that I found here, https://stackoverflow.com/questions/44616073/thread-py-error-snakemake/44625951#44625951:
and then added it to my rule as follows:
and somehow that stopped it from throwing the non-zero exit, and it ran without any errors, producing the forward and reverse files that I expected. I'm not sure why that solved the problem.
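As a follow-up to the echo $? suggestion earlier in the thread: echo $? reports a single status for the whole pipeline, so it does not say which stage failed. A rough sketch of one way to check each stage, assuming bash, is to read the PIPESTATUS array right after the pipeline. The pipeline below is a hypothetical stand-in (its middle stage fails on purpose), not the samtools command from the rule, and the array name stages is made up.

```shell
#!/usr/bin/env bash
# Stand-in pipeline: grep finds no match and exits with status 1.
set -o pipefail
printf 'a\nb\n' | grep zzz | sort > /dev/null

# Copy PIPESTATUS immediately; the next command would overwrite it.
stages=("${PIPESTATUS[@]}")

echo "exit codes per stage: ${stages[*]}"
# prints "exit codes per stage: 0 1 0"
```

Running the real command this way in a terminal should show which of the five stages returns a non-zero code, which may help explain why the rule fails under strict mode while the same command appears to work interactively.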