bwa parallel on a SLURM cluster
5.3 years ago
little_more ▴ 70

I'm working on a SLURM cluster with NGS data. I trimmed the raw reads and have been thinking about the best way to align them to the reference genome. I have paired reads for a few samples. I wrote a script that runs bwa in parallel:

#SBATCH --cpus-per-task=1
#SBATCH --ntasks=10
#SBATCH --nodes=1

# align with bwa & convert to bam
bwatosam() {
  id=$1
  index=$2
  output=$3/"$id".bam
  fq1=$4/"$id".R1.fq.gz
  fq2=$4/"$id".R2.fq.gz

  # double quotes so the shell expands $id inside the read-group string
  bwa mem -t 16 -R "@RG\tID:$id\tSM:$id\tPL:ILLUMINA\tLB:${id}_exome" -v 3 -M "$index" "$fq1" "$fq2" |
    samtools view -bo "$output"
}
export -f bwatosam

# run bwatosam in parallel
ls trimmed/*.R1.fq.gz |
 xargs -n 1 basename |
  awk -F ".R1" '{print $1 | "sort -u"}' |
   parallel -j $SLURM_NTASKS "bwatosam {} index.fa alns trimmed"

But I'm not sure I'm using the right #SBATCH parameters for the job, because if I run it without -j:

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=5

# run bwatosam in parallel
ls trimmed/*.R1.fq.gz |
  xargs -n 1 basename |
   awk -F ".R1" '{print $1 | "sort -u"}' |
    parallel "bwatosam {} index.fa alns trimmed"

It runs 10 times faster. What number of nodes/CPUs/threads should I use?

bwa

Have you tried submitting jobs directly to SLURM, without the additional complexity of parallel? On a cluster, using parallel adds complexity for no good reason as far as I can see.
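
For example, a SLURM job array lets the scheduler run one alignment task per sample. A rough sketch reusing the file layout from your post (the array range and thread count are placeholders to adjust):

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --array=0-9

# pick this array task's sample from the trimmed reads
samples=($(ls trimmed/*.R1.fq.gz | xargs -n 1 basename | sed 's/\.R1\.fq\.gz$//' | sort -u))
id=${samples[$SLURM_ARRAY_TASK_ID]}

bwa mem -t "$SLURM_CPUS_PER_TASK" \
  -R "@RG\tID:$id\tSM:$id\tPL:ILLUMINA\tLB:${id}_exome" \
  -v 3 -M index.fa trimmed/"$id".R1.fq.gz trimmed/"$id".R2.fq.gz |
  samtools view -bo alns/"$id".bam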

5.3 years ago
ATpoint 85k

Depends on the node. I typically run alignments with basically this kind of script (sorry Pierre Lindenbaum, no snakemake yet) on a 72-core node with 192GB RAM, and then use:

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=72
#SBATCH --partition=normal

In this case I would use 4 parallel processes with 16 threads for bwa each. It depends on how much memory your node has; can you give some details? When using parallel I recommend booking the entire node to ensure you are not interfering with processes from other users.
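
For example (a sketch; bwatosam is the function from your post, with bwa mem -t 16 inside):

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=72
#SBATCH --partition=normal

# 4 samples at a time, each using 16 bwa threads plus 1 samtools thread
ls trimmed/*.R1.fq.gz |
  xargs -n 1 basename |
  awk -F ".R1" '{print $1 | "sort -u"}' |
  parallel -j 4 "bwatosam {} index.fa alns trimmed"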

=> Note that I always book the entire node when running parallel jobs, so I essentially do not have to care about RAM consumption etc. as long as the node can handle it. If you share the node with others, it might be a good idea to ask your admin beforehand whether using parallel is allowed on your cluster nodes.


ATpoint, thanks for answering my question without endlessly referring to snakemake! :)

cluster specifications:

376 nodes: each with 2 processors (8 cores each) & 64 GB RAM

144 nodes: each with 2 processors (12 cores each) & 64 GB RAM

Do you specify how much memory you need with #SBATCH --mem=...? And do you use -j 4 to get 4 parallel processes? Because, as I understand it, by default parallel runs the maximum number of parallel processes (which depends on the number of CPUs on the node).


Yes, we have to set #SBATCH --mem=... on our cluster, as the batch system kills processes that use more than the specified amount. I have it at 80 GB by default. For your 64 GB nodes I would probably run 2 or 3 jobs in parallel, probably 2, as bwa sometimes uses a lot of memory when aligning batches of reads that are somewhat difficult (repetitive), from what I understand. That way you are more likely to avoid running out of memory.
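
Applied to your 64 GB nodes, that could look roughly like this (the exact --mem value is up to you; the point is to stay below the node's physical RAM):

# leave some headroom below the node's 64 GB of physical RAM
#SBATCH --nodes=1
#SBATCH --mem=60G

parallel -j 2 "bwatosam {} index.fa alns trimmed"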


So you suggest using:

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
parallel -j 2...

Right? Sorry for the many questions, I just started working with this cluster and I was surprised when parallel -j ran longer than plain parallel.


No worries. --ntasks-per-node=8 would need to be 34, as this is the total number of threads you are going to use: 2x16 for bwa plus 2x1 for samtools view. Without -j, parallel launches jobs on all files at once and will grab every available resource on the node, which may not be a good idea when sharing nodes with others.
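
So in sketch form, for the two-job setup:

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=34

# 2 parallel jobs x (16 bwa threads + 1 samtools thread) = 34
parallel -j 2 "bwatosam {} index.fa alns trimmed"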

I always book the entire node, so my advice here might not be adequate when sharing nodes with others; keep that in mind.


I see. So every task will run on a separate CPU, and a node needs to have 34 CPUs?


This is how I understand things.


I see. Thanks very much for your help!
