I'm running Mutect2 on some WES data. The .bam file is 4.7G, and I'm comparing it against the hg38 reference genome. I allocated 8 CPUs and 90G of memory using Slurm, but progress has been very slow. If I wanted the job to complete for a single sample within ~24 hours, what sort of CPU and memory allocation should I be using?
If there are parts of this pipeline that are single-threaded, there is not much you can do to speed those parts up by allocating more CPUs.
According to this post on the GATK forums, Mutect2 does not support multithreading. With 100G of RAM, it took 4.35 hours to process the first chromosome. This is using gatk v4.1.2, which supposedly has "significant speed improvements."
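Since a single Mutect2 process won't use the extra cores, one common workaround is to split the run by interval and launch one process per chromosome, each restricted with `-L`. Below is a minimal sketch in Python that only builds the command lines. The file names (`sample.bam`, `hg38.fa`, `mutect2_out`) and the helper `mutect2_commands` are hypothetical, and it assumes the reference uses `chr`-prefixed contig names; add whatever other arguments you are already passing.

```python
import shlex

# Primary hg38 contigs; assumes "chr"-prefixed names in the reference.
CHROMS = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def mutect2_commands(bam, reference, out_dir, chroms=CHROMS):
    """Return one Mutect2 command line per chromosome.

    Each command restricts calling to a single contig via -L, so the
    commands can be run as independent single-core jobs.
    """
    commands = []
    for chrom in chroms:
        args = [
            "gatk", "Mutect2",
            "-R", reference,                    # reference FASTA
            "-I", bam,                          # input BAM
            "-L", chrom,                        # restrict to this contig
            "-O", f"{out_dir}/{chrom}.vcf.gz",  # per-contig output
        ]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for cmd in mutect2_commands("sample.bam", "hg38.fa", "mutect2_out"):
        print(cmd)
```

Each printed line could then be submitted as its own Slurm job (or array task) with one CPU and a modest share of the memory, rather than one 8-CPU job. The per-chromosome VCFs would need to be merged afterwards, e.g. with `gatk MergeVcfs`, and the stats files with `gatk MergeMutectStats` before filtering. Wall time is then bounded by the slowest chromosome rather than the sum of all of them.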