I was running the JointDiscovery pipeline as part of the GATK Best Practices. I am running it on ~150 VCF files called by HaplotypeCaller, and I am getting this error:
23:32:16.900 INFO ProgressMeter - Traversal complete. Processed 19816687 total variants in 613.7 minutes.
23:32:17.434 INFO VariantDataManager - QD: mean = 22.14 standard deviation = 6.21
23:32:18.366 INFO VariantDataManager - MQRankSum: mean = -0.01 standard deviation = 0.25
23:32:19.385 INFO VariantDataManager - ReadPosRankSum: mean = 0.08 standard deviation = 0.47
23:32:20.172 INFO VariantDataManager - FS: mean = 1.38 standard deviation = 3.42
23:32:20.947 INFO VariantDataManager - MQ: mean = 59.89 standard deviation = 2.23
23:32:21.711 INFO VariantDataManager - SOR: mean = 0.70 standard deviation = 0.22
23:32:22.473 INFO VariantDataManager - DP: mean = 6835.64 standard deviation = 3532.38
01:01:24.432 INFO VariantDataManager - Annotations are now ordered by their information content: [DP, MQ, QD, SOR, MQRankSum, FS, ReadPosRankSum]
01:02:53.662 INFO VariantDataManager - Training with 6999956 variants after standard deviation thresholding.
01:02:53.662 WARN VariantDataManager - WARNING: Very large training set detected. Downsampling to 2500000 training variants.
01:05:04.968 INFO VariantRecalibrator - Shutting down engine
[September 18, 2019 1:05:04 AM EDT] org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator done. Elapsed time: 706.54 minutes.
Runtime.totalMemory()=3208118272
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.broadinstitute.hellbender.tools.walkers.vqsr.MultivariateGaussian.<init>(MultivariateGaussian.java:31)
at org.broadinstitute.hellbender.tools.walkers.vqsr.GaussianMixtureModel.<init>(GaussianMixtureModel.java:34)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:43)
at org.broadinstitute.hellbender.tools.walkers.vqsr.VariantRecalibrator.onTraversalSuccess(VariantRecalibrator.java:625)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:895)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:134)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:179)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:198)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
I believe this stems from an error earlier in the run, since the stderr reports the same Java heap space error:
[2019-09-16 19:05:59,50] [error] WorkflowManagerActor Workflow 9f7a01a4-0632-4817-8622-aa51e520abf1 failed (during ExecutingWorkflowState): Job JointGenotyping.SNPsVariantRecalibratorClassic:NA:1 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.
Check the content of stderr for potential additional information: /path/to/stderr.
I have read past issues regarding this (https://gatkforums.broadinstitute.org/gatk/discussion/23880/java-heap-space) that suggest it may be a bug. They point to increasing the available heap memory via -Xmx on the primary command. Is this the right way to do it?
import re

cmd = ('java -Xmx600G -Dconfig.file=' + re.sub('input.json', 'overrides.conf', input_json) +
       ' -jar ' + args.cromwell_path + ' run ' +
       re.sub('input.json', 'joint-discovery-gatk4.wdl', input_json) + ' -i ' + input_json)
where I substitute in the corresponding config, JSON, and WDL files. Is 600G enough? Each VCF is around 6 GB, and since I have 150 of them, does that mean I should be allocating more than 900 GB (6 GB x 150)?
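On the other hand, the log shows training was downsampled to 2,500,000 variants over 7 annotations, and a rough back-of-envelope (raw data matrix only, ignoring the mixture model and JVM overhead; the figures below are just taken from the log above) suggests that data is tiny:

# Rough sizing from the log: 2.5M training variants after downsampling,
# 7 annotations, 8 bytes per double. Data matrix only.
training_variants = 2_500_000
annotations = 7
bytes_per_double = 8
print(training_variants * annotations * bytes_per_double / 2**20)  # ~133.5 MiB

So I'm not sure the heap demand really scales with total VCF size.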
By the way, I would post this to the GATK forums, but it's taking too long to get a verified account.
'600G': do you have a server with 600 GB of memory?
Yes. But I'm not sure where the error is originating (i.e., is it a subprocess call in the WDL file?). I'm confused because the log says Runtime.totalMemory()=3208118272, which is only ~3 GB, even though I definitely specified far more memory than that.
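To double-check that number (Runtime.totalMemory() is reported in bytes):

print(3208118272 / 2**30)  # ~2.99 GiB

So the JVM that actually ran VariantRecalibrator only had about a 3 GB heap. Could it be that the task launches its own java process inside the WDL command block, so the -Xmx600G I pass to the Cromwell JVM never reaches it?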
I also looked at the memory allocation of the SNPsVariantRecalibrator task, and it's more than 3 GB as well, but I'm not sure whether this is the problem or how high I should set it.
It probably does not need 600G, since it only used about 3 GB. I would start by specifying 32G to 64G.
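If your copy of joint-discovery-gatk4.wdl exposes the task memory as a workflow input, you could set it in the inputs JSON rather than on the Cromwell command line. A minimal sketch in Python; the mem_gb key below is hypothetical, so list the real input names first (e.g., with "java -jar womtool.jar inputs joint-discovery-gatk4.wdl"):

import json

with open("input.json") as fh:
    inputs = json.load(fh)

# Hypothetical input key -- check womtool's inputs output for the
# actual name your WDL uses for this task's memory.
inputs["JointGenotyping.SNPsVariantRecalibratorClassic.mem_gb"] = 64

with open("input.json", "w") as fh:
    json.dump(inputs, fh, indent=2)

If the memory is hardcoded in the task's runtime block instead, you would edit the WDL directly.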
But why would it error out even if I allocate more than enough?
Did it give you the same "heap space" message and Runtime.totalMemory() value after you specified 600G and ran the command?
Yes, I got the same message.
Then I don't understand. Could you post this to the GATK help forum?