Distribute proportionality calculations on many nodes (SLURM)
pablo ▴ 310 · 5.6 years ago

Hello,

Let me explain what I did (in R):

1 - I have a matrix of OTU abundances (more than 817,000 columns). I need to compute the proportionality between these OTUs. For the moment, I can split the matrix into submatrices, compute the proportionality for each pair of submatrices, and then assemble the final matrix.

data <- matrix(runif(10000), ncol = 1000)  # random example matrix (10 rows x 1000 columns)
data <- clr(data)                          # centered log-ratio transform, e.g. from the 'compositions' package
ncol <- ncol(data)

rest   <- ncol %% 100   # leftover columns (0 here)
blocks <- ncol %/% 100  # number of 100-column submatrices (10 here)

ngroup <- rep(1:blocks, each = 100)
#if (rest > 0) ngroup <- c(ngroup, rep(blocks + 1, rest))  # handle a remainder block
split <- split(1:ncol, ngroup)

# I get all the unordered pairs of distinct submatrices
combs <- expand.grid(1:length(split), 1:length(split))
combs <- t(apply(combs, 1, sort))
combs <- unique(combs)
combs <- combs[combs[, 1] != combs[, 2], ]

# Requires a registered parallel backend (e.g. via doParallel) for %dopar%
res <- foreach(i = seq_len(nrow(combs))) %dopar% {
  G1 <- split[[combs[i, 1]]]
  G2 <- split[[combs[i, 2]]]
  dat.i <- cbind(data[, G1], data[, G2])
  cor_rho(dat.i)  # cor_rho is my function to compute the proportionality
}

And then, I get the final matrix:

resMAT <- matrix(0, ncol(data), ncol(data))

for (i in 1:nrow(combs)) {
  batch1 <- split[[combs[i, 1]]]
  batch2 <- split[[combs[i, 2]]]
  patch.i <- c(batch1, batch2)
  resMAT[patch.i, patch.i] <- res[[i]]
}

2 - I work with SLURM on a cluster with several nodes. I know that with one node (256 GB and 32 CPUs), I can compute the proportionality between 60,000 columns in one day. So I need to use 817,000 / 60,000 ≈ 14 submatrices, which gives (14 * 13) / 2 = 91 combinations (= 91 nodes).

3 - I don't know how to write a SLURM script that distributes each combination's calculation onto its own node.
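For what it's worth, turning a single linear combination index into the pair of submatrices it stands for is simple arithmetic over the upper triangle; a minimal sketch (the `map_comb` helper and its interface are my own illustration, not from the thread):

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a 1-based combination index k (1..B*(B-1)/2)
# to the pair "a b" of submatrix indices it represents, for B blocks.
map_comb() {
  local B=$1 k=$2 a=1
  # walk down the rows of the upper triangle until k falls inside one
  while [ "$k" -gt $((B - a)) ]; do
    k=$((k - (B - a)))
    a=$((a + 1))
  done
  echo "$a $((a + k))"
}

map_comb 14 1    # combination 1  -> submatrix 1 vs submatrix 2
map_comb 14 91   # combination 91 -> submatrix 13 vs submatrix 14
```

Each SLURM task could then call something like this with its own task id to find out which pair of submatrices it owns.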

Any advice ?

Best, Vincent

Tags: correlation matrix, slurm, parallelization

Is there a reason you haven't asked your cluster admin for the appropriate syntax for your cluster?


Actually, he gave me some advice, but I'm still struggling with the SLURM syntax...


Couldn't you just submit the job 14 times (with different input files being your 14 different submatrices)? This would require 14 different SLURM submission scripts, which would probably take less time to write than a Python or Perl script to automatically generate the 14 scripts.


I would like to keep my R code and incorporate it into a SLURM script, if that is possible?


resMAT[patch.i, patch.i] <- res[[i]] - you're only going to update entries on the diagonal of resMAT


But does that mean my final matrix is wrong?


ARG sorry, my mistake


This is not really a bioinformatics question but a programming one, so it might be better addressed on StackOverflow.

I am probably missing something, but you may not need to split your matrix into blocks; you could use something like this in R:

library(foreach)
library(doSNOW)

cluster <- makeCluster(number_of_nodes, type = "SOCK")
registerDoSNOW(cluster)
# This is a typical way to parallelize computation of a distance/similarity matrix
result_matrix <- foreach(j = 1:n, .combine = 'cbind') %:%
  foreach(i = 1:n, .combine = 'c') %dopar% {
    cor_rho(data[, i], data[, j])  # note: data[, i], not data[i], to select columns
  }
stopCluster(cluster)

The bash script submitted to SLURM would look like this:

#!/bin/bash

#SBATCH --mail-type=FAIL,END
#SBATCH -N number_of_nodes   # -N requests nodes
#SBATCH -n number_of_tasks   # -n requests tasks
#SBATCH --mem=memory_required

Rscript my_script

To avoid repeating calculations, maybe the foreach loops should be:

foreach(j = 1:(n - 1), .combine = 'cbind') %:%
  foreach(i = (j + 1):n, .combine = 'c') %dopar% {...

Thanks for your answer. I'll try your solution. When you put 1:n, does n mean ncol(data)?

However, if I execute my R script via a SLURM script, will SLURM "know" how to distribute each combination to a different node? (if I specify:

#SBATCH --mail-type=FAIL,END
#SBATCH -N 32
#SBATCH -n 91
#SBATCH --mem=250G )

Is the default value for -c (--cpus-per-task) set to 1 CPU per task (-n) on your cluster? If so, then you are asking for -n 91 cores/tasks from a node with only 32 cores.

I generally use -n with -N to specify the number of tasks/cores per node. If you only have 32 cores per node, then you may need to specify -n 32 along with -N 91 (nodes), if you want them all to run at the same time. I am not sure you can divide your R jobs to submit SLURM sub-jobs; using job arrays may be an option then.
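A job array along those lines might look like the following; this is a sketch under my own assumptions (the script name compute_pair.R and the resource numbers are illustrative, not from the thread):

```shell
#!/bin/bash
#SBATCH --mail-type=FAIL,END
#SBATCH --array=1-91        # one array task per submatrix pair
#SBATCH -N 1                # each task gets its own node
#SBATCH -n 32
#SBATCH --mem=250G

# The R script would read the array task id and pick the corresponding pair
Rscript compute_pair.R "$SLURM_ARRAY_TASK_ID"
```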


Yes, it is set to 1 CPU by default.

Actually, I would like to distribute each combination to one node, like this:

submatrix 1 vs submatrix 2   -> node 1
submatrix 2 vs submatrix 3   -> node 2
...
submatrix 13 vs submatrix 14 -> node 91

I don't know if that is possible?


It should be possible as long as you submit individual jobs. See changes I made to my comment above.


Actually, I don't mind if they don't all run at the same time. Could it be necessary to wait for some jobs to end before others begin?


Could it be necessary to wait for some jobs to end before others begin?

If that is the case, then you would need to look into --dependency=type:jobid option for sbatch to make sure those tasks don't start until the job (jobid above) they are dependent on finishes successfully.
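Concretely, that chaining might look like this (step1.sh and step2.sh are placeholder names of mine, not from the thread):

```shell
# Sketch: submit step2 so it starts only after step1 exits successfully.
# --parsable makes sbatch print just the job id, which we capture.
jid=$(sbatch --parsable step1.sh)
sbatch --dependency=afterok:"$jid" step2.sh
```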


Thanks. But is it normal that when I execute, for example (if I have only 10 combinations -> 10 nodes), sbatch --partition normal --mem-per-cpu=4G --cpus-per-task=16 --nodes=10 my_R_script.sh, it only creates one job and not 10 as I expected?


Yes, it is normal, because as far as SLURM is concerned, the only job it has been asked to run is my_R_script.sh. Unless that script creates sub-jobs from within, with multiple sbatch --blah do_this commands, SLURM can't run them as separate jobs.

Note: Some clusters may be configured not to allow an existing job to spawn sub-jobs. In that case your only option would be the one below.

The other way would be to start the 10 computations independently (if that is possible) by doing:

sbatch --blah my_R_script1.sh
sbatch --blah my_R_script2.sh
..
sbatch --blah my_R_script10.sh
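Rather than writing 10 (or 91) scripts by hand, the submissions could be generated in a loop; a sketch assuming a single run_pair.sh that reads the pair from exported variables (run_pair.sh is my own placeholder, and the sbatch line is commented out so the loop can be inspected without a scheduler):

```shell
#!/usr/bin/env bash
# Enumerate all pairs of B=14 submatrices and submit one job per pair.
B=14
count=0
for a in $(seq 1 $((B - 1))); do
  for b in $(seq $((a + 1)) $B); do
    count=$((count + 1))
    # sbatch --export=ALL,I="$a",J="$b" run_pair.sh   # illustrative submission
    echo "pair $count: submatrix $a vs submatrix $b"
  done
done
echo "total pairs: $count"
```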

Actually, I just use my_R_script.sh to execute the R script that I posted at the top of the topic. So it doesn't create any sub-jobs...

If I want to create the 91 sub-jobs I need, do I leave my R script as it is and write a SLURM script that creates these sub-jobs?

