How to submit multithreaded jobs on SLURM and determine number of threads?

Really simple question but for some reason I cannot find a simple answer to this.

We are moving from SGE to SLURM on our HPC. On SGE, we often submit a job with a range of threads, then inside the job, read the number of threads we got and use that to configure multithreaded programs. Example:

$ cat test.sh
#!/bin/bash

echo "got $NSLOTS cpu slots"

$ qsub -pe threaded 4-16 test.sh
Your job 5597119 ("test.sh") has been submitted

$ cat test.sh.o5597119
got 15 cpu slots

What is the SLURM equivalent of this?

I have tried a million variations of sbatch test.sh --cpus-per-task x --ntasks y across different queues, but nothing seems to work. It's also not clear how to request a range of threads, nor how to figure out how many you got; I've seen examples online that use $SLURM_NTASKS and $SLURM_CPUS_PER_TASK, but neither of those variables exists inside any of my jobs. The SLURM docs have been pretty unhelpful.


Option 1: The simple command-line SLURM equivalent is:

sbatch -p name_of_partition -n integer_#_cpu -N 1 --mem=NNg -t 1-0 -o file.out -e file.err your_script.sh

Here -N 1 keeps all the cores on one physical server, and -t takes either the days-hours format shown (1-0) or HH:MM format.
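For example, a filled-in version (the partition name general, core count, memory, and runtime below are placeholder values to adapt to your site):

sbatch -p general -n 8 -N 1 --mem=16g -t 1-0 -o test.out -e test.err test.sh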

Option 2: If you want to submit via a script, save the following in a file called script.sh.

#!/bin/bash

#SBATCH -p partition
#SBATCH -N 1
#SBATCH -n integer_#_cpu
#SBATCH --mem=NNg
#SBATCH -t 5-00:00:00

bbmap.sh -Xmx20g in1=file_R1.fq.gz in2=file_R2.fq.gz ...

Then submit by doing

sbatch script.sh
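To confirm what you were actually given after submitting, the usual follow-up is something like the following (jobid is whatever number sbatch prints back):

squeue -u $USER                             # is the job pending or running?
scontrol show job <jobid> | grep -i cpus    # shows the NumCPUs / CPUs/Task actually allocated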

Thanks. I'm using this with Nextflow, so I'll need the sbatch format. However, this doesn't show how to use a range of threads, or how your script can know how many threads it was given.


I am not sure if there is a range option available with SLURM (someone else might chime in on that). If you are looking for job arrays, see the sketch below. -n is the number of cores assigned to the script; make sure that number matches or exceeds what you are specifying within your script.
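A minimal job-array sketch (the partition, core count, and array range are placeholders to adapt):

#!/bin/bash

#SBATCH -p partition
#SBATCH -n 4
#SBATCH --array=1-10

# each array element runs its own copy of the script with a different SLURM_ARRAY_TASK_ID
echo "array task $SLURM_ARRAY_TASK_ID running with $SLURM_NTASKS cores"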


I see. Typically, the point is not to hard-code the thread count in the script, but to get the number of threads from whatever the scheduler actually allocated.


I have always specified the number of cores I want with SGE/LSF/SLURM over the years. There is always a default allocation of cores available per user: jobs dispatch right away until that allocation is used up, and the rest pend in the queue. Most fair-use clusters reserve enough cores that users submitting small/new jobs will always have slots available. I have never depended on the number of cores a cluster assigns dynamically.

You may wish to talk with your admins to see if the policy you are used to still applies in case of SLURM.


The issue this targets is the situation where a node has, say, 7 slots available and you requested 8; I'd rather just take the 7 than wait for a full 8 to open up. Similarly, when you have a CPU-slot quota that does not divide evenly across your resource allocation, the last jobs in your queue can squeeze in with slightly fewer threads, since it could take hours for the other jobs to complete and make room for the full allotment.


While you wait for someone to respond, it would be best to check with your local sysadmins.

I have moved my answer to a comment since it does not satisfy your exact need for now. Having had access to large clusters for many years, I am spoiled.


For SLURM, I think the only resource you can request as a range is --nodes. As for the environment variables being set or not, it depends on how you run your scripts. For example, take the following script:

#!/bin/bash
echo "$SLURM_JOB_ID jobid, $SLURM_JOB_NUM_NODES nodes."
echo "SLURM says $SLURM_NTASKS ntasks" 
echo "SLURM says $SLURM_CPUS_PER_TASK cpus per task"

If I run it with

sbatch -p short --nodes=1-4 --ntasks=1 --mem 1g -t 00:30 -o res1.out  test.sh

Then cat res1.out shows:

12396 jobid, 1 nodes.
SLURM says 1 ntasks
SLURM says  cpus per task

The last line is blank after "says" because SLURM_CPUS_PER_TASK is not set when --cpus-per-task isn't specified.

Now if I run it with:

sbatch -p short --nodes=1-4 --ntasks=1 --cpus-per-task=10 --mem 1g -t 00:30 -o res2.out  test.sh

Then cat res2.out shows:

12397 jobid, 1 nodes.
SLURM says 1 ntasks
SLURM says 10 cpus per task
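Tying that back to the original question, a minimal sketch of the SGE pattern (assuming you request the cores with --cpus-per-task and want a fallback of 1 thread when the variable is unset):

#!/bin/bash

# SLURM_CPUS_PER_TASK is only set when --cpus-per-task is given
THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "got $THREADS cpu slots"

submitted with something like:

sbatch -p short --ntasks=1 --cpus-per-task=16 -o res3.out test.sh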


You can't request a range of threads; you must state how many you need. The program you run in the script only knows how many threads you've been allocated if you tell it. It is completely possible to lie to SLURM and say:

#SBATCH -c 1
bowtie2 -p 20 ...

Jobs won't be killed for using too much CPU (it's often impossible to predict exactly how much will get used), at least on our cluster; rather, what you specify with #SBATCH -c X is simply removed from the pool of allocatable resources.

Make sure to specify #SBATCH --ntasks-per-node=1 to prevent tasks from getting split across nodes (unless your processes can actually handle that; few can in bioinformatics).
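Putting those two points together, a minimal batch-script sketch for a single multithreaded program (the partition, memory, and time values are placeholders), reading the thread count from the allocation instead of hard-coding it:

#!/bin/bash

#SBATCH -p partition
#SBATCH -c 20
#SBATCH --ntasks-per-node=1
#SBATCH --mem=32g
#SBATCH -t 1-0

# SLURM_CPUS_PER_TASK mirrors whatever -c was set to, so the command
# and the request stay in sync if you change -c later
bowtie2 -p "${SLURM_CPUS_PER_TASK:-1}" ...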


So how should I specify the SLURM parameters if I want to have, say, STAR running with 20 threads?

Let's just use srun instead of a shell script plus sbatch combo for simplicity, since it summarizes all the parameters in a one-liner:

srun --ntasks-per-node=1 --cpus-per-task=20 <...> STAR --runThreadN 20 <...> ?
