I’m moving from an HPC cluster that used slurm for job scheduling to one with no job management program. Is there a program that can manage lots of parallel jobs, kinda like a personal slurm? I’m using screen, but it’s kinda annoying; I’d like to start a lot of jobs, log off, and come back later.
Do you mean to say that it is a "free for all" situation, i.e. you could occupy the entire cluster with your jobs without consideration for others? Surely some form of control over user processes is in place, e.g. per-user limits in /etc/security/limits.conf?
Depending on how much effort you want to invest, you could run your program(s) via snakemake and let snakemake handle parallelism and job dependencies. There will be a learning curve, but in my opinion it is totally worth it.
Alternatively, xargs (available on *nix systems) can run programs in parallel. For example, to run the bash scripts listed by ls, up to 8 at a time:
ls job_number.*.sh | xargs -P 8 -n 1 bash
GNU parallel is an alternative to xargs, and arguably more powerful.
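A slightly more defensive variant of the xargs approach might look like the sketch below. The scratch directory, the four toy job scripts, and the per-job .log naming are all made up for the demo; only the job_number.*.sh pattern comes from the example above. It assumes an xargs that supports -0 and -P (GNU and BSD versions do).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory and toy job scripts, created only for this demo.
workdir=$(mktemp -d)
for i in 1 2 3 4; do
  printf '#!/usr/bin/env bash\necho "job %s done"\n' "$i" > "$workdir/job_number.$i.sh"
done

# find -print0 with xargs -0 keeps file names containing spaces intact.
# -P 2 runs at most two jobs at once; -n 1 passes one script per invocation.
# The sh -c wrapper redirects each job's output to its own log file.
find "$workdir" -name 'job_number.*.sh' -print0 |
  xargs -0 -P 2 -n 1 sh -c 'bash "$1" > "$1.log" 2>&1' _

# Roughly equivalent with GNU parallel, if it is installed (not run here):
#   parallel -j 2 bash ::: "$workdir"/job_number.*.sh

cat "$workdir"/job_number.*.sh.log
```

Writing one log per job makes it easier to see which script failed when you come back later. On its own this does not survive a logout, so it would still need to run under screen, nohup, or similar.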
You want the task spooler, a "personal batch system" that doesn't require root privileges. You define how many threads your system can run in parallel and then just submit jobs to the queue. The spooler starts the jobs as soon as enough cores are free.
It's deliberately simple and only considers parallel threads, not memory or other resources. It doesn't send emails by default, but you can script what should happen when a task finishes (including sending an email). As a personal "submit and forget" replacement for a full-fledged job manager, I liked it a lot.
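For what it's worth, a session with task spooler could look roughly like the sketch below. It assumes the program is installed under the name tsp, as Debian and Ubuntu package it; upstream calls the binary ts, which can clash with an unrelated tool from moreutils. The job_number.*.sh scripts and the notify.sh path are placeholders.

```shell
#!/usr/bin/env bash
# Assumption: task spooler is available as `tsp` (Debian/Ubuntu package name).
if command -v tsp >/dev/null 2>&1; then
  # Optional hook run when each job ends; per the man page it receives the
  # job id, exit code, output file and command. The path is a placeholder.
  # export TS_ONFINISH="$HOME/bin/notify.sh"

  tsp -S 8                            # allow up to 8 jobs to run at once
  for script in job_number.*.sh; do   # hypothetical job scripts
    [ -e "$script" ] || continue      # skip if the pattern matched nothing
    tsp bash "$script"                # enqueue; prints the job id and returns
  done
  tsp -l                              # list queued, running and finished jobs
  status="queued"
else
  echo "tsp not found; install task spooler first" >&2
  status="skipped"
fi
```

The queue is handled by a background server that tsp starts on first use, so the jobs should keep running after you log off. Later, tsp -l shows their state and tsp -c followed by a job id prints that job's output.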
You can set up slurm on a single node if you'd like.
Hi Devon, would you mind expanding on that? I mean, I know you can install slurm on a single node, but is it something reasonably doable by someone without admin rights and without investing a fair bit of effort maintaining slurm over time?
No, you'd need admin rights :(