Hi there!
I have a list of paired FASTQ files and I want to filter human reads out of each pair using the BBMap tool. Here is a functional Snakemake rule I wrote:
# Removing human reads from fastq
rule clean_fastq:
    message:
        "Removing human reads from fastq."
    input:
        unzip_fastq_R1 = rules.fastq_unzip.output.unzip_fastq_R1,
        unzip_fastq_R2 = rules.fastq_unzip.output.unzip_fastq_R2,
        BBMAP = rules.get_BBmap.output.BBMAP,
        index = rules.create_index.output.index
    output:
        R1_cleaned = result_repository + "FASTQ_CLEANED/{sample}_R1_cleaned.fastq",
        R2_cleaned = result_repository + "FASTQ_CLEANED/{sample}_R2_cleaned.fastq"
    params:
        path_human = result_repository + "FASTQ_CLEANED/"
    shell:
        """
        # unmapped (i.e. non-human) reads are written to outu1/outu2
        {input.BBMAP} in1={input.unzip_fastq_R1} in2={input.unzip_fastq_R2} \
        basename={params.path_human}{wildcards.sample}_%.fastq outu1={output.R1_cleaned} outu2={output.R2_cleaned} \
        path=temp/
        """
Each job needs about 20 GB of RAM, and the problem is that I only have 32 GB available. I don't know whether Snakemake can execute all jobs from the same rule in a queue, one after the other, to avoid this memory problem.
If not, I should probably look for another tool to process these FASTQ files. Any ideas? (except bmtagger, I had too many problems with it haha) What would you suggest?
Thx,
Hadrien
Are you running them locally or on a cluster? At least in the latter case you just specify the memory required in the cluster command and let the cluster manager handle it. If you're running locally, I expect you have to use -j 1 to run just one job at a time, since I don't think Snakemake can be made aware of local limitations (this would be a good feature request!). Also, explicitly set the amount of RAM you want bbtools to use by adding -XmxNNg to your bbtools command lines.
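For example, roughly like this (just a sketch based on your rule above; -Xmx20g is a placeholder for whatever heap limit you pick, since you mentioned ~20 GB per job):

# inside the rule's shell block: cap the Java heap used by BBMap
{input.BBMAP} in1={input.unzip_fastq_R1} in2={input.unzip_fastq_R2} \
    basename={params.path_human}{wildcards.sample}_%.fastq \
    outu1={output.R1_cleaned} outu2={output.R2_cleaned} \
    path=temp/ -Xmx20g

# and on the command line, run at most one job at a time
snakemake -j 1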
I'm a bit late, but thanks to everyone who posted here!
Managing resources in Snakemake seems to be a good way to limit concurrent jobs. This perfectly solved my problem.
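In case it helps someone else, the relevant bits look roughly like this (a minimal sketch; the 20000 / 32000 MB values simply mirror the 20 GB per job and 32 GB total figures above):

# declare the per-job memory on the rule
# (the rest of clean_fastq stays as shown in the question)
rule clean_fastq:
    resources:
        mem_mb = 20000

# give Snakemake the total memory budget; it will never schedule
# jobs whose combined mem_mb exceeds it, so the 20 GB jobs run one at a time
snakemake -j 4 --resources mem_mb=32000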
Check the green mark on the left to validate dariober's answer.