OMA standalone, parallelization
1
0
Entering edit mode
3.7 years ago
mar.ark.parr ▴ 40

Hi all,

I have a set of yeast genomes (136 genomes, ~9 Mb and ~6000 sequences each) and would like to find groups of orthologous genes in them using OMA standalone.

Because of the size of the dataset, I am trying to parallelize the OMA run on a cluster with the SGE scheduler.

First, I ran oma -c to convert the databases.

Then I submitted the jobs with the command qsub -t 1-32 -cwd run_oma.sh, where run_oma.sh contains two lines:

export NR_PROCESSES=32
oma
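One thing that could be checked inside run_oma.sh is whether each task really runs as part of the array and whether the array size agrees with NR_PROCESSES. The sketch below is only an illustration: check_array_env is a hypothetical helper, not part of OMA, and it relies on the SGE_TASK_ID and SGE_TASK_LAST variables that SGE sets for array tasks (SGE_TASK_ID is "undefined" outside an array job). How OMA itself uses these variables is not shown here.

```shell
#!/bin/sh
# Hypothetical helper (not part of OMA): sanity-check the SGE array
# environment against NR_PROCESSES before starting oma.
check_array_env() {
    # $1 = NR_PROCESSES, $2 = SGE_TASK_ID, $3 = SGE_TASK_LAST
    nr=$1; id=$2; last=$3
    case "$id" in
        ''|undefined)
            # SGE sets SGE_TASK_ID to "undefined" for non-array jobs
            echo "not an array task"
            return 1
            ;;
    esac
    if [ "$nr" -ne "$last" ]; then
        echo "mismatch: NR_PROCESSES=$nr but last task is $last"
        return 1
    fi
    echo "task $id of $nr"
}

# Possible use inside run_oma.sh (assumption, adapt as needed):
#   export NR_PROCESSES=32
#   check_array_env "$NR_PROCESSES" "$SGE_TASK_ID" "$SGE_TASK_LAST" || exit 1
#   oma
```

The message ends up in the job's output file, so a mismatch between the -t range and NR_PROCESSES would be visible right away.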

All jobs are running, but the estimated remaining times are very large (~150000 h) and have not decreased within 6 hours, so I am not sure that the run is parallelized properly.

Can anyone help me find out what is happening and how I can speed up the calculation?

Kind regards, Marina

oma parallelization sge qsub • 1.2k views

I would recommend checking the node your job is running on to see if it's using as many processes as you expect it to be using.
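A rough way to do that check after logging in to a compute node is to count your own processes by command name. This is just a sketch: count_comm is a hypothetical helper, and the name "oma" is an assumption, since the worker processes may show up under a different executable name on your system.

```shell
#!/bin/sh
# Hypothetical helper: read one command name per line on stdin and
# print how many lines are exactly equal to $1.
count_comm() {
    # grep -c exits non-zero when the count is 0, so ignore its status
    grep -cxF -- "$1" || true
}

# Possible use on a compute node (process name "oma" is an assumption):
#   ps -u "$USER" -o comm= | count_comm oma
```

Comparing that count with the number of array tasks qstat reports for the same host shows whether each task has actually started a worker.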


Thank you for the answer, Dave! In the output of qstat I see that there are 32 processes running. There are 32 lines like this:

job-ID   prior    name        user  state  submit/start at      queue     slots  ja-task-ID
7533430  0.55500  run_oma.sh  -     r      03/03/2021 11:10:13  all.q@fr  1      1
...
7533430  0.55500  run_oma.sh  -     r      03/03/2021 11:10:13  all.q@ze  1      32

3.7 years ago

Hi Marina,

These estimates are rather rough: each process bases its estimate only on the work it is doing itself, so they can be quite a bit off. Use oma-status to get a better sense of how far along the overall computation is. Also, depending on the performance of the filesystem, it may be wise to run oma-compact regularly to summarize some of the result files.

Cheers Adrian


Hi Adrian, thank you for the answer! Unfortunately, according to the oma-status output, the estimates seem quite reasonable:

Summary of OMA standalone All-vs-All computations:
Nr chunks started: 32 (0.01%)
Nr chunks finished: 1151 (0.42%)
Nr chunks finished w/o exported genomes: 1151 (0.42%)
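For a back-of-the-envelope check of these numbers, the finished-chunk count and percentage can be turned into a total chunk count and a wall-clock estimate. The sketch below assumes a constant chunk rate and takes the elapsed time as an input; eta_hours is a hypothetical helper, and the 6 hours in the usage line is only an assumed value, since the thread does not say exactly how long the run had been going when oma-status was called.

```shell
#!/bin/sh
# Hypothetical helper: extrapolate total chunks and remaining wall-clock
# hours from oma-status numbers, assuming a constant chunk rate.
eta_hours() {
    # $1 = chunks finished, $2 = percent finished, $3 = elapsed hours
    awk -v fin="$1" -v pct="$2" -v hrs="$3" 'BEGIN {
        total = fin * 100 / pct                 # implied total number of chunks
        remaining = hrs * (100 - pct) / pct     # hours left at the same rate
        printf "total_chunks=%.0f remaining_hours=%.0f\n", total, remaining
    }'
}

# Example with the figures above and an assumed 6 hours of runtime:
#   eta_hours 1151 0.42 6
```

With those inputs the sketch gives roughly 274,000 chunks in total and on the order of 1,400 wall-clock hours remaining across all 32 processes. The percentage is rounded in the oma-status output, so this is only an order-of-magnitude figure.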
