Hi all,
I have a set of yeast genomes (136 genomes, each ~9 Mb with about 6000 sequences) and would like to find groups of orthologous genes in them using OMA standalone.
Because of the size of the dataset, I am trying to parallelize the OMA run on a cluster with the SGE scheduler.
First, I ran oma -c to convert the databases.
Then I submitted the jobs with the command qsub -t 1-32 -cwd run_oma.sh
where run_oma.sh contains two lines:
export NR_PROCESSES=32
oma
All jobs are now running, but the estimated remaining time is very large (~150000 h) and has not decreased in 6 hours, so I am not sure the run is parallelized properly.
Can anyone help me find out what is happening and how I can speed up the calculation?
Kind regards, Marina
I would recommend checking the node your job is running on to see whether it is using as many processes as you expect.
Thank you for the answer, Dave! In the output of qstat I see that 32 processes are running. There are 32 lines like this:
job-ID prior name user state submit/start at queue slots ja-task-ID
7533430 0.55500 run_oma.sh - r 03/03/2021 11:10:13 all.q@fr 1 1
...
7533430 0.55500 run_oma.sh - r 03/03/2021 11:10:13 all.q@ze 1 32
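If it helps, counting the lines by eye can be replaced with a small helper. This is only a sketch: count_task_states is a hypothetical name, and it assumes the default qstat column layout shown above, where the job state is the fifth field. Note that it only confirms how many tasks the scheduler reports as running, not whether OMA itself is splitting the work between them.

```shell
# Hypothetical helper: count qstat lines per job state, reading qstat output on stdin.
# Assumes the default SGE column layout (job-ID prior name user state ...),
# so the state is field 5. Header and separator lines are skipped because
# their first field is not a numeric job ID.
# Pending array tasks are usually collapsed onto one line, so for states
# such as qw this counts lines rather than individual tasks.
count_task_states() {
    awk '$1 ~ /^[0-9]+$/ { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Example on the cluster (not run here):
#   qstat -u "$USER" | count_task_states
```

For the listing above, this should report all 32 lines under state r.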