Hello,
I have a R (RHO_COR.R) script and I would like to create a loop in order to split the jobs on several nodes.
I show the part of the script where I would like to create the loop.
# assumes the foreach package and a registered parallel backend
library(foreach)

res <- foreach(i = seq_len(nrow(combs))) %dopar% {
  G1 <- split[[combs[i, 1]]]
  G2 <- split[[combs[i, 2]]]
  dat.i <- cbind(data[, G1], data[, G2])
  rho.i <- cor_rho(dat.i)
}
The results in res (which correspond to submatrices of correlations between OTUs) are stored in several files. combs is a matrix that looks like this (but it can change, depending on my input file):
> combs
[,1] [,2]
[1,] 1 2
[2,] 1 3
[3,] 1 4
[4,] 1 5
[5,] 2 3
[6,] 2 4
[7,] 2 5
[8,] 3 4
[9,] 3 5
[10,] 4 5
I would like to send each row of combs (i.e. each value of seq_len(nrow(combs))) to a separate node.
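One common way to do this (my assumption, not something from the original script) is to pass the row index to the script as a command-line argument and read it inside RHO_COR.R with commandArgs(). A runnable sketch, with a stub shell function standing in for the Rscript call so it works outside the cluster:

```shell
#!/bin/bash
# Inside RHO_COR.R the index could be read with (hypothetical):
#   i <- as.integer(commandArgs(trailingOnly = TRUE)[1])
# Stub standing in for "Rscript RHO_COR.R <i>" so the sketch runs anywhere:
run_row() { echo "processing combs row $1"; }
for i in 1 2 3; do
    run_row "$i"
done
```

Each invocation then computes only the submatrix for its own row of combs.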
This is my Slurm script:
#!/bin/bash
#SBATCH -o job-%A_task.out
#SBATCH --job-name=paral_cor
#SBATCH --partition=normal
#SBATCH --time=1-00:00:00
#SBATCH --mem=126G
#SBATCH --cpus-per-task=32
#Set up whatever package we need to run with
module load gcc/8.1.0 openblas/0.3.3 R
# SET UP DIRECTORIES
OUTPUT="$HOME"/$(date +"%Y%m%d")_parallel_nodes_test
mkdir -p "$OUTPUT"
export FILENAME=~/RHO_COR.R
#Run the program (redirect into a file inside the output directory, not the directory itself)
Rscript "$FILENAME" > "$OUTPUT"/RHO_COR.out
I do not want to use arrays. I was wondering whether creating an argument that takes the values of seq_len(nrow(combs)) could be a solution:
for i in my_argument
do Rscript $FILENAME -i > "$OUTPUT"
done
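Following up on the loop above, here is a hedged sketch of what a corrected version could look like: the loop variable passed as $i, one output file per iteration, and each iteration launched through srun. The srun command is echoed rather than executed so the sketch runs outside Slurm (drop the echo on the cluster); the file names are my assumptions.

```shell
#!/bin/bash
# Dry-run sketch (assumed paths and file names):
FILENAME=~/RHO_COR.R
OUTPUT=out_dir
NROWS=10          # nrow(combs) for the 10-row example above
for i in $(seq 1 "$NROWS"); do
    # drop the leading "echo" on the cluster to actually launch the step
    echo "srun --nodes=1 --ntasks=1 Rscript $FILENAME $i > $OUTPUT/res_$i.out"
done
```

Redirecting each iteration to its own file (res_$i.out here) avoids the ten steps overwriting one another's output.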
Thanks.
(I asked on Stack Overflow but haven't received an answer yet.)
You'll need an srun in there, and $i rather than -i.

Thanks for your reply. But can I combine srun and Rscript in the same loop? Another point: I don't know how to pass my R variable as an argument into this loop.

Edit: I saved my variable into a file that I read in bash.
And I use (Slurm part):
I still struggle to find a way to execute the jobs on several nodes. The loop is executed on only one node, and I can't find the issue with srun...

If you're going to use %dopar%, then run it in parallel directly in R and don't bother submitting multiple jobs. You'll have to figure out how to do that on your local cluster, of course. Otherwise, just use an array job, or create a loop in your sbatch script calling srun for each value of i.

That's what I would like to do: call srun for each value of i.
I tried:
But it is still executed on only one node...
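A likely cause (my assumption; the thread never confirms it): the sbatch header only ever requests a single node, so every srun step inherits that one-node allocation. A hedged sketch of a header that actually spans several nodes, with the srun line echoed so the script also runs outside Slurm; the node and task counts are assumptions to adjust for the cluster:

```shell
#!/bin/bash
#SBATCH --job-name=paral_cor
#SBATCH --nodes=5        # assumption: 5 nodes for the 10 row pairs
#SBATCH --ntasks=10      # one task per row of combs
#SBATCH --cpus-per-task=1
# Each backgrounded step claims one task from the allocation; if the
# allocation has only one node, all steps land on the same machine.
for i in $(seq 1 10); do
    # drop the leading "echo" on the cluster to actually launch the step
    echo srun --nodes=1 --ntasks=1 --exclusive Rscript ~/RHO_COR.R "$i" &
done
wait    # do not let the batch script exit before the steps finish
```

The & backgrounds each step so they run concurrently, and wait keeps the allocation alive until they all complete.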