Scale cores down when close to memory limit snakemake/nextflow
5.2 years ago
gb ★ 2.2k

Dear all,

Background:

Here is a question that could maybe also be a forum post (let me know if it needs to be moved there). I have an old variant calling pipeline that I want to make more efficient and maintainable. My first thought is to move it to snakemake so that multiple samples and different parts of the pipeline can run simultaneously without much programming on my part. The issue is that I can run, for example, 10 small samples or 5 big samples at the same time, and the samples are always mixed. I have not started on this pipeline yet, but I think that if I tell snakemake it can use 20 cores, it will sometimes run out of memory and sometimes not.

Question:

Is there a way to detect the percentage of used memory and, if it exceeds a certain threshold, scale down the number of cores, or pause one of the rules or samples?

At the moment I am planning to use snakemake, but I recently discovered that nextflow also exists, so if you recommend it in terms of memory usage/management, let me know.

EDIT:

Based on a comment from Michael Dondrup, it may be better to restart a rule/job if it fails. Is there a way to do this inside snakemake itself, i.e. catch a memory limit error, pause for a moment, and restart the job? For now I can only think of ways to do it in a separate Python script. I know snakemake is Python, but I want to keep the "main" Snakefile as clean as possible.
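
One possible direction, sketched here rather than tested on a real pipeline: Snakemake lets a rule's resources be given as a function, and it passes that function the input files and the current attempt number. The sketch below estimates memory from input size and scales it up on each retry. The constants `BASE_MEM_MB` and `MEM_PER_INPUT_MB` and the function name `mem_mb_for_job` are made up for illustration and would need tuning per tool.

```python
# Sketch of a Snakemake resource callable. All numbers are illustrative
# assumptions, not measured values for any variant caller.
BASE_MEM_MB = 4000       # hypothetical fixed overhead per job
MEM_PER_INPUT_MB = 3     # hypothetical MB of RAM needed per MB of input


def mem_mb_for_job(wildcards, input, attempt):
    """Estimate memory from input size and grow it on each retry.

    Snakemake calls this with the job's wildcards, its input files
    (which expose a total size via ``input.size_mb``) and the attempt
    number, starting at 1.
    """
    estimate = BASE_MEM_MB + MEM_PER_INPUT_MB * input.size_mb
    return int(estimate * attempt)
```

In a rule this would be referenced as `resources: mem_mb=mem_mb_for_job`, and the workflow run with something like `snakemake --cores 20 --resources mem_mb=64000 --restart-times 2`. With a global `mem_mb` budget the scheduler only starts jobs whose declared memory still fits, so big samples automatically get fewer parallel slots than small ones. Note that Snakemake reruns any failed job, not just out-of-memory failures, and on a single machine `mem_mb` is a scheduling hint rather than an enforced limit.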

pipeline memory cores • 1.6k views
5.2 years ago
Michael 55k

On Linux systems you could read /proc/meminfo before starting processes and only start a new process if enough free memory is available. In general, however, it is probably best to leave memory management to the OS.

 $ cat /proc/meminfo
MemTotal:       2113418852 kB
MemFree:        1226025892 kB
MemAvailable:   2077812608 kB
Buffers:          265728 kB
....
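
The check described above could be sketched in Python roughly as follows. This is a minimal sketch, assuming the usual `Field:   value kB` layout of /proc/meminfo; the function names `parse_meminfo` and `enough_memory` are made up for illustration, and the fallback to `MemFree` is an assumption for older kernels that lack `MemAvailable`.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not a "Field: <number> ..." line.
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def enough_memory(required_kb, meminfo_text=None):
    """Return True if the available memory covers required_kb.

    If meminfo_text is not given, the live /proc/meminfo is read
    (Linux only).
    """
    if meminfo_text is None:
        with open("/proc/meminfo") as handle:
            meminfo_text = handle.read()
    info = parse_meminfo(meminfo_text)
    # Prefer MemAvailable; fall back to MemFree if it is missing.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    return available >= required_kb
```

A launcher script could call `enough_memory(...)` before starting each sample and wait when it returns False. The check is only a snapshot: a job that is already running may still grow after the check has passed.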

I think I already know some possibilities, and now one more with meminfo; I can also use psutil from Python. But I am not sure it is the most "general" way. I also can't think of the right search terms for Google.


I think it would be better to implement some sort of fail-over management: take the return value of a job, check for memory errors, and restart it, possibly once the load is lower.
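
A rough sketch of that idea in plain Python, outside any workflow manager. It assumes that an exit code of -9 (as reported by `subprocess`) or 137 (as reported through a shell) means the process was killed, which on Linux is often but not always the out-of-memory killer. The names `run_with_retry` and `LIKELY_OOM_CODES` are made up for illustration.

```python
import subprocess
import time

# Exit codes that commonly mean "killed by SIGKILL", e.g. by the Linux
# OOM killer. Treating them as memory errors is an assumption.
LIKELY_OOM_CODES = {-9, 137}


def run_with_retry(command, max_attempts=3, wait_seconds=60,
                   runner=None, sleep=time.sleep):
    """Run command and retry after a pause if it looks OOM-killed.

    Returns (exit_code, attempts_used). runner and sleep can be
    replaced, which makes the function easy to try without real jobs.
    """
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd).returncode
    code = None
    for attempt in range(1, max_attempts + 1):
        code = runner(command)
        if code not in LIKELY_OOM_CODES:
            return code, attempt          # success or an unrelated failure
        if attempt < max_attempts:
            sleep(wait_seconds * attempt)  # wait longer before each retry
    return code, max_attempts
```

Failures with any other exit code are returned immediately, so a genuine bug in the job is not retried three times. The growing pause is a crude stand-in for "once the load is lower"; a memory check before restarting would be a more direct test.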


Based on your helpful answer I updated my question.
