10.5 years ago
I want to run EstimateLibraryComplexity.jar with a 9.8 GB BAM file, but I always get an OutOfMemoryError. I have already tried -Xmx (up to 60 GB) and still get the error. Does anybody have an idea how to run EstimateLibraryComplexity on bigger BAM files?
Here is my call and the error message:
$ java -Xmx10g -jar EstimateLibraryComplexity.jar INPUT=file.bam OUTPUT=file.libraryComplexity
[Wed Jun 04 21:43:08 CEST 2014] picard.sam.EstimateLibraryComplexity INPUT=[file.bam] OUTPUT=file.libraryComplexity MIN_IDENTICAL_BASES=5 MAX_DIFF_RATE=0.03 MIN_MEAN_QUALITY=20 MAX_GROUP_RATIO=500 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed Jun 04 21:43:08 CEST 2014] Executing as me@work on Linux 3.6.2-1.fc16.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_07-b10; Picard version: 1.114(444810c1de1433d9eca8130be63ccc7fd70a9499_1400593393) JdkDeflater
INFO 2014-06-04 21:43:08 EstimateLibraryComplexity Will store 15494157 read pairs in memory before sorting.
INFO 2014-06-04 21:43:13 EstimateLibraryComplexity Read 1,000,000 records. Elapsed time: 00:00:05s. Time for last 1,000,000: 5s. Last read position: chr10:38,239,480
....
INFO 2014-06-04 21:53:21 EstimateLibraryComplexity Read 30,000,000 records. Elapsed time: 00:10:13s. Time for last 1,000,000: 183s. Last read position: chr15:34,522,127
[Wed Jun 04 22:54:26 CEST 2014] picard.sam.EstimateLibraryComplexity done. Elapsed time: 71.30 minutes.
Runtime.totalMemory()=5801312256
To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:2694)
at java.lang.String.<init>(String.java:203)
at java.lang.String.substring(String.java:1913)
at htsjdk.samtools.util.StringUtil.split(StringUtil.java:89)
at picard.sam.AbstractDuplicateFindingAlgorithm.addLocationInformation(AbstractDuplicateFindingAlgorithm.java:71)
at picard.sam.EstimateLibraryComplexity.doWork(EstimateLibraryComplexity.java:256)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:124)
at picard.sam.EstimateLibraryComplexity.main(EstimateLibraryComplexity.java:217)
And this is the Java version:
$ java -showversion
java version "1.7.0_07"
Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
EDIT: I also posted this question at SEQanswers!
This smacks of a bug in the program, especially since it happened after over an hour of runtime. What version of Picard tools are you running?
Picard version: 1.114
Thanks! Now I see it was waaaay over to the right in your original post! >.<
Out of curiosity, did it actually max out the space allocated when you used -Xmx60g?
I don't know any more. But when I check the used memory for the run above, it looks like it only used ~5 GB (Runtime.totalMemory()=5801312256), doesn't it?
Indeed, this sounds like a bug. You might post a message to the samtools-help email list and see if one of the authors has run into this (if not, it looks like there's a bug report to be filed).
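On the memory question a few comments up: as far as I know, Runtime.totalMemory() reports the heap the JVM has reserved at that moment, not the ceiling set by -Xmx; the ceiling is what Runtime.maxMemory() returns. A minimal sketch that prints both, so you can check whether your -Xmx value is actually taking effect on your machine (the class name HeapReport is made up for this example and is not part of Picard):

```java
public class HeapReport {
    // Convert a byte count to GiB for readable output.
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper limit the JVM will try to use for the heap (driven by -Xmx).
        System.out.printf("maxMemory   = %.2f GiB%n", toGiB(rt.maxMemory()));
        // Heap currently reserved by the JVM; can be well below maxMemory.
        System.out.printf("totalMemory = %.2f GiB%n", toGiB(rt.totalMemory()));
        // Unused portion of the currently reserved heap.
        System.out.printf("freeMemory  = %.2f GiB%n", toGiB(rt.freeMemory()));
    }
}
```

Compile with javac HeapReport.java and run with the same flag as the Picard call, e.g. java -Xmx10g HeapReport. For reference, the 5801312256 bytes in the log above are roughly 5.4 GiB, which is below the 10g requested in that run, so the number alone does not tell you how close the process came to its limit.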
Here's another possibility: your tmp location is being filled up by the operation, so the error is actually triggered when you run out of swap disk. Do you mind checking the location of your /tmp/ folder, and the amount of free space on its host volume?
In the past I've resolved this by symlinking /tmp to a folder on a large volume.
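If you want to check this from the JVM's side rather than with df, here is a minimal sketch that prints the temp directory Java will use (the java.io.tmpdir system property) and the free space on its volume. The class name TmpSpace is hypothetical, and whether a given Picard version writes its spill files there is an assumption you should verify against the tool's usage output.

```java
import java.io.File;

public class TmpSpace {
    // Resolve the directory the JVM uses for temporary files.
    static File tmpDir() {
        return new File(System.getProperty("java.io.tmpdir"));
    }

    public static void main(String[] args) {
        File tmp = tmpDir();
        double gib = 1024.0 * 1024.0 * 1024.0;
        System.out.println("java.io.tmpdir = " + tmp.getAbsolutePath());
        // Space available to this JVM on the volume holding the temp directory.
        System.out.printf("usable space   = %.1f GiB%n", tmp.getUsableSpace() / gib);
        // Total size of that volume, for comparison.
        System.out.printf("total space    = %.1f GiB%n", tmp.getTotalSpace() / gib);
    }
}
```

Running it as java -Djava.io.tmpdir=/path/to/big/volume TmpSpace (the path is a placeholder) shows what the JVM would see if you redirect temp files with that property instead of symlinking /tmp.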
I tracked the free space of the volume and the size of the /tmp/ folder, and both are far from being filled up. But thanks for the idea... it was worth a try.
Have you tried raising the MIN_IDENTICAL_BASES parameter to something like 10 or even 15? With a BAM file that size, it actually makes sense that you would run out of memory during the sort step.
Hi David,
I know it's been a long time since you posted this thread.
I was curious to know how the error was resolved.
Could you please update the thread?
Thanks
Sorry, but there is no update. I just stopped using PicardTools.