Failure to launch OMA in array mode on SLURM cluster
7.1 years ago
jeremias.br ▴ 10

I am working on a CentOS 7.3 cluster running Slurm and I am using OMA 2.1.1. Unfortunately I am not able to run OMA in array mode. I am working with the included files in the ToyExample directory.

Here is the SLURM script I have run:

#!/bin/bash
#SBATCH --time=6:00:00
#SBATCH --job-name="toy_test"
export NR_PROCESSES=2
$HOME/soft/OMA.2.1.1/bin/oma

I launched the script with this command as per the instructions on the OMA site:

sbatch --array=1-2 -N1 toy_launch.sh

The jobs fail with this error:

Starting database conversion and checks...

ERROR: failed to parse anything

When I remove the "export NR_PROCESSES=2" line from the script, OMA does launch, but it assumes it was not launched as an array job:

Starting database conversion and checks...

WARNING: not run as a job-array. Will assume it is a single process

This Biostars issue seems relevant but has a different error message. I attempted to implement the fix presented there, but it seems there have been some major changes to the lib/Platforms file since then.

OMA SLURM orthologs • 2.1k views
7.1 years ago

Hi Jeremias,

I'm one of the OMA developers. This problem is quite strange to me; it is likely caused by a Slurm configuration that differs from the ones we have tested so far. I would like to understand the problem a bit better in order to improve Slurm support for OMA standalone. There are two things I suggest trying:

  1. Try setting NR_PROCESSES outside your launch script: export NR_PROCESSES in the shell before you run the sbatch command.
  2. If it still fails, could you check what the following environment variables are set to inside the job? To do that, replace the call to oma in your script file with the following lines:

    echo "NR_PROCESSES $NR_PROCESSES"
    echo "SLURM_ARRAY_JOB_ID $SLURM_ARRAY_JOB_ID"
    echo "SLURM_ARRAY_TASK_ID $SLURM_ARRAY_TASK_ID"
    echo "SLURM_ARRAY_TASK_MAX $SLURM_ARRAY_TASK_MAX"
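
The same check can be wrapped in a small helper that also makes unset variables visible. This is only a sketch: print_array_env is a hypothetical name (not part of OMA or Slurm), it assumes the script runs under bash, and the "<unset>" marker is an arbitrary choice.

```shell
#!/bin/bash
# Diagnostic sketch: report the variables relevant to array-mode detection.
# print_array_env is a hypothetical helper, not part of OMA or Slurm.
print_array_env() {
    local name
    for name in NR_PROCESSES SLURM_ARRAY_JOB_ID SLURM_ARRAY_TASK_ID SLURM_ARRAY_TASK_MAX; do
        # ${!name} is bash indirect expansion; print a marker when the
        # variable is unset or empty so that a missing value is obvious.
        echo "$name ${!name:-<unset>}"
    done
}

print_array_env
```

Placed in the job script before the call to oma, this prints one line per variable in the same "NAME value" layout as the echo lines above.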

Thanks already for reporting.

Best wishes,
Adrian

7.1 years ago
jeremias.br ▴ 10

Hi Adrian,

Thanks for your reply.

1. Exporting NR_PROCESSES in the shell used to submit the job did not work. OMA still thinks it is running as a single process.

2. I did two runs: one with the "export NR_PROCESSES=2" command in the script and one without.

With the export command:

Starting database conversion and checks...

ERROR: failed to parse anything

NR_PROCESSES 2

SLURM_ARRAY_JOB_ID 13799019

SLURM_ARRAY_TASK_ID 1

SLURM_ARRAY_TASK_MAX 2

Without the export command, it again runs as a single process:

Starting database conversion and checks...

WARNING: not run as a job-array. Will assume it is a single process

[...]

Done!!

NR_PROCESSES

SLURM_ARRAY_JOB_ID 13799064

SLURM_ARRAY_TASK_ID 1

SLURM_ARRAY_TASK_MAX 2
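
Values like the ones reported above can be cross-checked mechanically. The following is a minimal sketch and not OMA's own detection logic: check_array_env is a hypothetical helper, and the rule that NR_PROCESSES should equal SLURM_ARRAY_TASK_MAX is an assumption that only holds for arrays numbered from 1, such as the --array=1-2 submission used here.

```shell
#!/bin/bash
# Consistency sketch for an array numbered 1..N (assumption, see above).
# check_array_env is a hypothetical helper, not part of OMA or Slurm.
check_array_env() {
    if [ -z "${SLURM_ARRAY_TASK_ID:-}" ]; then
        echo "not an array job"
        return 1
    fi
    if [ -z "${NR_PROCESSES:-}" ]; then
        echo "NR_PROCESSES is not set"
        return 1
    fi
    if [ "$NR_PROCESSES" != "${SLURM_ARRAY_TASK_MAX:-}" ]; then
        echo "NR_PROCESSES ($NR_PROCESSES) differs from SLURM_ARRAY_TASK_MAX (${SLURM_ARRAY_TASK_MAX:-unset})"
        return 1
    fi
    echo "consistent: task $SLURM_ARRAY_TASK_ID of $NR_PROCESSES"
}

# Report the result without aborting the surrounding job script.
check_array_env || true
```

With the values from the first run (NR_PROCESSES 2, task ID 1, task max 2) this check passes, so the environment itself looks consistent in the run that ends with "failed to parse anything"; in the second run it reports that NR_PROCESSES is not set.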


So strange! What version of Slurm are you using? The cluster I have access to, where the ToyExample works like a charm with your submission script, is a Red Hat Enterprise Linux 6.7 installation with Slurm 14.11.10. (You can get the Slurm version with slurmctld -V.)


I'm using Slurm 17.02.3 on CentOS 7.3.1611.
