If someone has already managed to run Cell Ranger with Slurm, maybe you can help me:
Until now, I have been running Cell Ranger on a cluster, on a single node with 512 GB of RAM.
For a dataset of 6.7k cells and 90k reads/cell, the `cellranger count` command takes 7h30.
I have 5 nodes with 512 GB of RAM each at my disposal.
I tried the Slurm template provided by 10x (even though it is not officially supported); jobs are submitted by Martian:
```
#!/usr/bin/env bash
#SBATCH -J __MRO_JOB_NAME__
#SBATCH -p big
#SBATCH --export=ALL
#SBATCH --nodes=1 --ntasks-per-node=__MRO_THREADS__
#SBATCH --signal=2
#SBATCH --no-requeue
### Alternatively: --ntasks=1 --cpus-per-task=__MRO_THREADS__
### Consult with your cluster administrators to find the combination that
### works best for single-node, multi-threaded applications on your system.
#SBATCH --mem=__MRO_MEM_GB__G
#SBATCH -o __MRO_STDOUT__
#SBATCH -e __MRO_STDERR__
__MRO_CMD__
```
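For anyone unfamiliar with the template: Martian substitutes the `__MRO_*__` placeholders per stage before calling `sbatch`. A quick way to see what a submitted script would look like is to render the template by hand with made-up values (everything below is illustrative; it is not what Martian actually fills in):

```
# Hypothetical by-hand rendering of slurm.template, just to preview a submitted script.
# The values below (job name, 4 threads, 16 GB, log paths, echo command) are made up.
sed -e 's/__MRO_JOB_NAME__/demo_stage/' \
    -e 's/__MRO_THREADS__/4/' \
    -e 's/__MRO_MEM_GB__/16/' \
    -e 's|__MRO_STDOUT__|./demo_stage.out|' \
    -e 's|__MRO_STDERR__|./demo_stage.err|' \
    -e 's|__MRO_CMD__|echo "stage command goes here"|' \
    slurm.template
```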
When I run the same `cellranger count` command on the same dataset, but with the Slurm template, as follows:

```
cellranger count --transcriptome=refdata-cellranger-mm10-3.0.0 --fastqs=./indepth_C07_MissingLibrary_1_HL5G3BBXX,./indepth_C07_MissingLibrary_1_HNNWNBBXX --jobmode=./martian-cs/v3.2.3/jobmanagers/slurm.template
```
I checked the jobs submitted: Martian submits jobs to Slurm in batches of 64, and one job per node is running (this is how the cluster works: I can only run one job per node, because each job uses all 16 CPUs of the node).
So instead of having 1 node busy, I parallelize over 5 nodes. BUT:
it takes 12h42 instead of 7h30.
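As an aside, the running-versus-pending picture can be confirmed with plain Slurm commands (nothing Cell Ranger specific); for example:

```
# One line per stage job: id, state (R = running, PD = pending), elapsed time,
# node count, reason or node list, and job name.
squeue -u "$USER" --format="%.12i %.2t %.10M %.6D %R %j"
```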
I checked the running processes and the number of CPUs used. When I use a single node, the first process, `read_chunks`, uses 1-4 CPUs, and the second process, `python` (I don't know what it is doing?), uses 16 CPUs, i.e. all of them.
With parallelization over 5 nodes, `read_chunks` takes 1-4 CPUs, but `python` uses ONLY ONE CPU on each node instead of 16 CPUs. I guess that's why it takes so long!!
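As another aside, this per-process CPU usage can be checked directly on a compute node with standard tools (assuming `ps` and `top` are available there):

```
# Busiest processes with their thread counts (NLWP) and CPU usage.
ps -eo pid,nlwp,pcpu,comm --sort=-pcpu | head -n 15

# Or a one-shot, thread-level view.
top -H -b -n 1 | head -n 25
```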
Is it because Slurm is not officially supported?
Do you think I can modify something in the template to change that?
Thanks for your answer.
Do you mean allowing multiple jobs per node? Or allowing jobs to be submitted from a compute node?
I will try your way
Allow jobs to be submitted by a job that is already running on a compute node. The main job that you submit takes care of running the sub-jobs as needed.
After verification, my cluster does allow submitting sub-jobs. I tried your approach (writing the cellranger command in a bash script and wrapping it); the only difference I see is that the `mrp` process (the process that submits all the jobs) runs on a compute node instead of on the head node. But the computing time is still the same: it takes much longer when I use 5 nodes instead of one.
Have you already tried running cellranger on a single node? Is it slower for you?
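For context, the "write the command in a bash script and wrap it" approach described above amounts to submitting a small batch script whose only job is to launch `mrp` (via `cellranger count`) in cluster mode, so the pipeline orchestrator itself runs on a compute node. A minimal sketch, reusing the partition, paths, and sample id mentioned elsewhere in this thread (the wrapper's CPU and memory requests are guesses, since `mrp` itself is lightweight):

```
#!/usr/bin/env bash
#SBATCH -J cellranger_mrp
#SBATCH -p big
#SBATCH --ntasks=1 --cpus-per-task=2   # mrp only orchestrates; the heavy work runs in sub-jobs
#SBATCH --mem=8G
#SBATCH -o mrp_%j.out
#SBATCH -e mrp_%j.err

# mrp submits the stage jobs back to Slurm using the template given in --jobmode.
cellranger count \
  --id=irradiated \
  --transcriptome=refdata-cellranger-mm10-3.0.0 \
  --fastqs=./indepth_C07_MissingLibrary_1_HL5G3BBXX,./indepth_C07_MissingLibrary_1_HNNWNBBXX \
  --jobmode=./martian-cs/v3.2.3/jobmanagers/slurm.template
```

Submitted with `sbatch wrapper.sh` from a node that is allowed to call `sbatch`.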
Since you have a large dataset, there may not be much you can do to speed this process up. You could try assigning more cores, which may help with the alignment steps (STAR), but otherwise the steps that run on single cores will always be the bottleneck.

Ok, but still, that doesn't explain why it is faster in job submission mode on 1 compute node and slower in cluster mode on 5 compute nodes, with the same parameters. It should be faster in cluster mode on 5 nodes; that is what they say in the documentation. For me, using cluster mode dramatically increases the time to solution. Steps that run on single cores in cluster mode should also run on single cores when I submit on a single node, and that's not the case. As a result, I will just submit on only one node instead of using cluster mode, which is too bad.

We are not using `#SBATCH --nodes=1 --ntasks-per-node=__MRO_THREADS__`.
That may be limiting `python` to one core per node. We control the total number of jobs started by cellranger by using `--maxjobs=24`. If I wanted to have more cores, then I would increase this number. For us it has not been a big issue, since the largest NovaSeq runs we have done have taken less than 24h to complete.

If you are willing, I would take out that directive above and then increase `--maxjobs=24` to a higher number and see if that helps. You may also want to take out `#SBATCH --no-requeue` unless it is needed for your cluster.
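To make that suggestion concrete, one possible edit to the template (untested here, and based only on the alternative already spelled out in the template's own comments) is to request a single task with `__MRO_THREADS__` CPUs rather than `__MRO_THREADS__` one-CPU tasks, which is the form recommended for single-node, multi-threaded applications and may avoid the one-core `python` behaviour observed above. `--no-requeue` is also dropped in this sketch, as suggested:

```
#!/usr/bin/env bash
#SBATCH -J __MRO_JOB_NAME__
#SBATCH -p big
#SBATCH --export=ALL
#SBATCH --ntasks=1 --cpus-per-task=__MRO_THREADS__   # one multi-threaded task per stage job
#SBATCH --signal=2
#SBATCH --mem=__MRO_MEM_GB__G
#SBATCH -o __MRO_STDOUT__
#SBATCH -e __MRO_STDERR__
__MRO_CMD__
```

The pipeline would then be launched exactly as before, optionally with a higher `--maxjobs` cap.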
I tried without `#SBATCH --nodes=1 --ntasks-per-node=__MRO_THREADS__`. I am using the example data for cellranger: it is not a large dataset, and it takes less than 24h to complete: 7h with job submission mode on 1 compute node and 12h with cluster mode on 5 compute nodes. I would like to get 7h or less, but in cluster mode.

I tried with `--maxjobs=24` and also with `--maxjobs=100`; it doesn't change anything. I always have only 5 jobs running (one per node) and the rest in the waiting queue. I can't submit multiple jobs on a node, because each job is supposed to use all the CPUs.

Is that data from 10x's site? I may try it out if I find the time. Every cluster is set up differently, and it is possible that something on your cluster is causing this. Have you tried to work with your cluster admins to see if they can help?
Thanks for your time, it would be great if you could try it out, but I understand if you don't find the time. I will try to check my Slurm configuration in detail.
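As a starting point for that check (assuming `scontrol` and `sinfo` are available, as on any standard Slurm install), the scheduler's resource-selection and task-binding settings are the ones most likely to explain a multi-threaded process being confined to one core:

```
# How Slurm selects resources and binds/tracks tasks.
scontrol show config | grep -Ei 'SelectType|TaskPlugin|ProctrackType'

# CPU and memory layout per node, as Slurm sees it.
sinfo -N -o "%N %c %m %P"
```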
I found all the links on this 10x page: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/tutorials/gex-analysis-nature-publication
The data I am using: https://sra-pub-src-1.s3.amazonaws.com/SRR7611048/C07.bam.1
I had to run `bamtofastq C07.bam.1 irradiated`.
The reference: http://cf.10xgenomics.com/supp/cell-exp/refdata-cellranger-mm10-3.0.0.tar.gz
And then:

```
cellranger count --id=irradiated --transcriptome=/path/to/refdata-cellranger-mm10-3.0.0 --fastqs=./indepth_C07_MissingLibrary_1_HL5G3BBXX,indepth_C07_MissingLibrary_1_HNNWNBBXX
```
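Putting those steps together, an end-to-end sketch of the reproduction (assuming `wget`, `tar`, and `bamtofastq` are on the PATH, and that bamtofastq writes the two flowcell directories under `./irradiated`; adjust the `--fastqs` paths if they end up elsewhere):

```
# 1. Download the BAM from SRA and convert it back to FASTQs with 10x's bamtofastq.
wget https://sra-pub-src-1.s3.amazonaws.com/SRR7611048/C07.bam.1
bamtofastq C07.bam.1 irradiated

# 2. Download and unpack the mm10 reference used above.
wget http://cf.10xgenomics.com/supp/cell-exp/refdata-cellranger-mm10-3.0.0.tar.gz
tar -xzf refdata-cellranger-mm10-3.0.0.tar.gz

# 3. Run the count pipeline on the two flowcell directories produced by bamtofastq.
cellranger count \
  --id=irradiated \
  --transcriptome=./refdata-cellranger-mm10-3.0.0 \
  --fastqs=./irradiated/indepth_C07_MissingLibrary_1_HL5G3BBXX,./irradiated/indepth_C07_MissingLibrary_1_HNNWNBBXX
```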
We don't really have cluster admins; we administer the cluster ourselves (there are 2 of us on the platform), so we have a standard Slurm configuration and, so far, no particular issues with it.
It took about 5.5 h for the jobs to finish. I did not see a huge difference by throwing more cores into the pool: jobs using 44 cores instead of 24 finished in more or less the same time. Disclaimer: our cluster stays busy, so it is possible some sub-jobs may have pended for some time.