I am running the GATK CollectAllelicCounts tool, but I get an error that I don't know how to fix. The interval list was downloaded from the GATK resource bundle (https://storage.cloud.google.com/genomics-public-data/resources/broad/hg38/v0/wgs_calling_regions.hg38.interval_list), and I then ran the GATK PreprocessIntervals tool on it to create the interval list used here.
My script is below:
WD="/home/Desktop/CNV"
REF="${WD}/ref/hg38.fasta"
INT="${WD}/ref/wgs.hg38.interval_list"
DICT="${WD}/ref/hg38.fasta.dict"
# TMPFILE (a temporary directory) and NAME (the BAM file prefix) are defined elsewhere (not shown).
time gatk --java-options "-Xmx16g -Djava.io.tmpdir=${TMPFILE}" CollectAllelicCounts \
--intervals "${INT}" \
--input "${NAME}.addRG.mkdup.recal.bam" \
--reference "${REF}" \
--tmp-dir "${TMPFILE}" \
--sequence-dictionary "${DICT}" \
--output "${NAME}.allelic_counts.tsv"
The error is below:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3181)
    at java.util.ArrayList.grow(ArrayList.java:265)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
    at java.util.ArrayList.add(ArrayList.java:462)
    at org.broadinstitute.hellbender.tools.copynumber.datacollection.AllelicCountCollector.collectAtLocus(AllelicCountCollector.java:72)
    at org.broadinstitute.hellbender.tools.copynumber.CollectAllelicCounts.apply(CollectAllelicCounts.java:152)
    at org.broadinstitute.hellbender.engine.LocusWalker.lambda$traverse$0(LocusWalker.java:176)
    at org.broadinstitute.hellbender.engine.LocusWalker$$Lambda$91/1519482659.accept(Unknown Source)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at org.broadinstitute.hellbender.engine.LocusWalker.traverse(LocusWalker.java:174)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
I tried increasing the memory, but that didn't work either.
What is ${TMPFILE}?
A temporary directory.
I would test the script with a small subset of both the BAM file and the interval file to see if it is indeed a memory/size error or something more general.
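A minimal sketch of how such a subset could be made, assuming samtools is installed and the BAM is coordinate-sorted and indexed. The function names, the file names in the usage comments, and the choice of chr21 are only illustrations, not part of the original script. The interval list is filtered with awk so that every header line (starting with @) is kept and only the intervals on one contig remain.

```shell
# Keep all header lines (starting with "@") and only the intervals
# whose first tab-separated column matches the requested contig.
# $1 = input interval list, $2 = contig name to keep
subset_interval_list() {
    awk -F '\t' -v contig="$2" '/^@/ || $1 == contig' "$1"
}

# Extract the reads on one contig into a new BAM and index it.
# Requires samtools and an index for the input BAM.
# $1 = input BAM, $2 = contig name, $3 = output BAM
subset_bam() {
    samtools view -b -o "$3" "$1" "$2" && samtools index "$3"
}

# Example usage (hypothetical file names):
#   subset_interval_list wgs.hg38.interval_list chr21 > chr21.interval_list
#   subset_bam sample.addRG.mkdup.recal.bam chr21 sample.chr21.bam
```

If the run on a single small contig still fails with the same heap error, the size of the full input is less likely to be the only cause.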
Yes, I tried subsetting both the BAM file and the interval file as well, but I get a similar error:
[May 22, 2020 12:15:40 PM HKT] org.broadinstitute.hellbender.tools.copynumber.CollectAllelicCounts done. Elapsed time: 21.76 minutes.
Runtime.totalMemory()=15772155904
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3181)
    at java.util.ArrayList.grow(ArrayList.java:265)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
    at java.util.ArrayList.add(ArrayList.java:462)
    at org.broadinstitute.hellbender.tools.copynumber.datacollection.AllelicCountCollector.collectAtLocus(AllelicCountCollector.java:72)
    at org.broadinstitute.hellbender.tools.copynumber.CollectAllelicCounts.apply(CollectAllelicCounts.java:152)
    at org.broadinstitute.hellbender.engine.LocusWalker.lambda$traverse$0(LocusWalker.java:176)
    at org.broadinstitute.hellbender.engine.LocusWalker$$Lambda$91/2118482375.accept(Unknown Source)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at org.broadinstitute.hellbender.engine.LocusWalker.traverse(LocusWalker.java:174)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
real 21m49.559s
user 140m38.016s
sys 0m15.373s
I already downsampled the BAM file from 10X to 1X, but the error is still there. I am using GATK 4.1.7 in a conda environment.
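For reference, the Runtime.totalMemory() value in the log above can be converted to GiB and compared with the -Xmx16g limit. The helper name bytes_to_gib is only an illustration. Since totalMemory() reports the heap the JVM currently holds, a value this close to the cap suggests the heap really did fill up, rather than the -Xmx option being ignored.

```shell
# Convert a byte count to GiB (1 GiB = 1024^3 bytes), two decimal places.
bytes_to_gib() {
    awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / (1024 * 1024 * 1024) }'
}

# Value reported by Runtime.totalMemory() in the log above.
bytes_to_gib 15772155904   # prints 14.69
```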
Then the error is more general. To narrow down the problem, try running the command without variables such as $tmp and outside of the script you are using.
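If you do keep the variables, a quick way to rule them out is to check that each one is actually set before calling gatk. This is a minimal POSIX-shell sketch; require_vars is a hypothetical helper name, and the variable names in the usage comment are the ones from the script in the question.

```shell
# Return non-zero if any of the named variables is unset or empty.
# The function body runs in a subshell (note the parentheses), so its
# working variables do not leak into the calling script.
require_vars() (
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "variable $name is unset or empty" >&2
            missing=1
        fi
    done
    return "$missing"
)

# Example usage at the top of the script:
#   require_vars WD REF INT DICT TMPFILE NAME || exit 1
```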
Yes, I tried that. With no variables in the script, the same error came out.
Then I would contact the developers.