Dear all,
Background:
Here is a question that could maybe also be a forum post (let me know if it needs to be moved there). I have an old variant calling pipeline that I want to make more efficient and maintainable. My first thought is to move it to Snakemake, so that multiple samples and different parts of the pipeline can run simultaneously without much programming on my side. The issue is that, for example, I can run 10 small samples or 5 big samples at the same time, and samples are always mixed. I have not started on this pipeline yet, but I think that if I tell Snakemake it can use 20 cores, it will sometimes run out of memory and sometimes not.
Question:
Is there a way to detect the percentage of used memory and, if it exceeds a certain threshold, scale down the number of cores, or pause one of the rules or samples?
At the moment I am planning to use Snakemake, but I recently discovered that Nextflow also exists, so if you would recommend that instead in terms of memory usage/management, let me know.
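From what I have read so far, Snakemake rules can declare memory as a resource, so the scheduler would limit how many jobs run at once by total declared memory instead of only by cores. A minimal sketch of what I think that would look like (the rule name, file names and memory numbers are just placeholders, not my real pipeline):

    rule call_variants:
        input:
            "mapped/{sample}.bam"
        output:
            "calls/{sample}.vcf"
        threads: 4
        resources:
            mem_mb=8000   # rough estimate of the peak memory of this step
        shell:
            "call_variants --threads {threads} {input} > {output}"

and then running it with something like `snakemake --cores 20 --resources mem_mb=64000`, so that only jobs whose declared memory fits in the budget are started together. What I don't see is how this could react to the actual measured memory usage, hence my question.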
EDIT:
Based on a comment from Michael Dondrup it may be better to restart a rule/job if it fails. Is there a way to do this inside Snakemake itself, i.e. catch a memory limit error, pause for a moment and restart the job? For now I can only think of ways to do it in a separate Python script. I know Snakemake is Python, but I want to keep the "main" Snakefile as clean as possible.
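If I read the documentation correctly, Snakemake can restart failed jobs itself (a retries directive on the rule in recent versions, or the --restart-times command-line option), and resources can be callables that receive an attempt number, so the memory reservation could grow on each retry. A rough sketch of what I mean (rule name, command and numbers are placeholders):

    rule call_variants:
        input:
            "mapped/{sample}.bam"
        output:
            "calls/{sample}.vcf"
        retries: 2          # re-run the job up to 2 more times if it fails
        resources:
            # reserve more memory on each attempt: 8 GB, then 16 GB, then 24 GB
            mem_mb=lambda wildcards, attempt: attempt * 8000
        shell:
            "call_variants {input} > {output}"

Is this the intended way to handle out-of-memory failures, or is there something more direct?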
I think I already know some possibilities, and now one more with meminfo; I could also use psutil from Python, but I am not sure it is the most "general" way, and I can't think of the right search terms for Google. I think it would be better to implement some sort of fail-over management: take the return value of a job, check for memory errors and restart it, possibly once the load is lower.
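To illustrate the separate-script idea, this is roughly what I had in mind with psutil (the command, threshold and retry logic are only placeholders to show the approach):

    # Wait until system memory drops below a threshold before (re)starting a command.
    import subprocess
    import time

    import psutil

    def run_when_memory_available(cmd, max_used_percent=80, poll_seconds=30):
        """Block until used memory is below the threshold, then run cmd."""
        while psutil.virtual_memory().percent >= max_used_percent:
            time.sleep(poll_seconds)
        return subprocess.run(cmd, check=False).returncode

    if __name__ == "__main__":
        rc = run_when_memory_available(["my_variant_caller", "sample1.bam"])
        if rc != 0:
            # naive fail-over: wait a bit and retry once when the job failed
            time.sleep(60)
            run_when_memory_available(["my_variant_caller", "sample1.bam"])

But I would rather not maintain this kind of wrapper myself if the workflow manager can already do it.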
Based on your helpful answer I updated my question.