Question

Picard MarkDuplicates Error: Requested Array Size Exceeds VM Limit

1

12.2 years ago

xiaoyanli82 ▴ 10

Hi, everyone,

I am working with wheat exome data (Illumina PE 100 reads) and analysed it following the GATK pipeline. Reads were aligned to the reference using BWA, and the BAM files were sorted with samtools. But when I tried to mark PCR duplicates using Picard, I got an error:

 java -Xms16g -Xmx256g -jar MarkDuplicates.jar I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup REMOVE_DUPLICATES=false MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 VALIDATION_STRINGENCY=LENIENT

Error: Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit

I also tried other options:

java -Xmx160g -jar MarkDuplicates.jar I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup REMOVE_DUPLICATES=false MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 VALIDATION_STRINGENCY=LENIENT

Error: java.lang.OutOfMemoryError: Java heap space

The BAM file is 4 GB, the Java version is 1.7.0_11, and Picard is version 1.92. What can I do to solve this problem?

Thank you !

Xiaoyan

picard markduplicates • 10k views

ADD COMMENT • link updated 12.2 years ago by Jordan ★ 1.3k • written 12.2 years ago by xiaoyanli82 ▴ 10

score 2 · Answer 1 · 2013-05-23

2

12.2 years ago

Jordan ★ 1.3k

I'm just going to guess here. Did you try increasing your PermGen space? If PermGen is too small, increasing the heap will not help, no matter how large you make it.

Try this:

-XX:MaxPermSize=1g

Or however much you think is necessary.
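
For reference, here is a minimal dry-run sketch of where that flag would go, reusing the file names from the question. It only prints the command instead of launching Java. The 16g heap and 1g PermGen values are illustrative assumptions, not tuned settings. Note that -XX:MaxPermSize applies to Java 7 and earlier; PermGen was removed in Java 8.

```shell
#!/bin/sh
# Dry run: assemble the MarkDuplicates command with a PermGen limit and print it.
# JVM options must come before -jar; everything after the jar name goes to Picard.
# The 16g heap and 1g PermGen values are illustrative assumptions, not tuned settings.
JVM_OPTS="-Xmx16g -XX:MaxPermSize=1g"
PICARD_ARGS="I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup REMOVE_DUPLICATES=false VALIDATION_STRINGENCY=LENIENT"

CMD="java $JVM_OPTS -jar MarkDuplicates.jar $PICARD_ARGS"
echo "$CMD"
# To actually run it, replace the echo line with: $CMD
```

Printing the command first makes it easy to confirm that the JVM flags sit before -jar, since anything placed after the jar name is passed to Picard rather than to the JVM.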

ADD COMMENT • link 12.2 years ago by Jordan ★ 1.3k

0

It should be "-XX:MaxPermSize=1g", but I don't think this is the issue. IMO the problem is trying to allocate that much memory.

ADD REPLY • link 12.2 years ago by Pavel Senin ★ 1.9k

0

Oh yeah, sorry about that. Fixed it. It could be what you said, too. But I'm curious how much RAM his system actually has!

ADD REPLY • link 12.2 years ago by Jordan ★ 1.3k

0

The Linux system has 1024 GB of RAM.

ADD REPLY • link 12.2 years ago by xiaoyanli82 ▴ 10

0

Sounds good then. Which OS and JVM implementation are you running? And what is the output if you specify -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+PrintGCDetails -XX:+PrintGCTimeStamps?
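
In case it helps with answering, here is a small sketch for collecting that information on Linux. It assumes `uname` is available and falls back to a message when `java` or /proc/meminfo is missing. The function name report_env is made up for this example.

```shell
#!/bin/sh
# Print the OS, JVM, and physical memory details asked about above.
# report_env is a hypothetical helper name used only in this sketch.
report_env() {
    echo "os: $(uname -sr)"
    if command -v java >/dev/null 2>&1; then
        # java -version writes to stderr, so redirect it to stdout.
        echo "java: $(java -version 2>&1 | head -n 1)"
    else
        echo "java: not found on PATH"
    fi
    if [ -r /proc/meminfo ]; then
        # MemTotal is reported in kB on Linux.
        echo "mem: $(grep '^MemTotal:' /proc/meminfo | awk '{print $2, $3}')"
    else
        echo "mem: /proc/meminfo not readable"
    fi
}

report_env
```

The GC flags themselves go before -jar on the java command line, like any other JVM option.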

ADD REPLY • link 12.2 years ago by Pavel Senin ★ 1.9k

score 1 · Answer 2 · 2013-05-23

1

12.2 years ago

NB ▴ 960

I had the same problem with human genomes, and adding "MAX_RECORDS_IN_RAM=5000000" helped. Maybe you can try that. Here's the command I use:

 java -Xmx16g -jar MarkDuplicates.jar    \
    I=IN.bam \
    O=OUT.bam \
    METRICS_FILE=dupmetrics.txt \
    REMOVE_DUPLICATES=true \
    MAX_RECORDS_IN_RAM=5000000 \
    ASSUME_SORTED=true \
    VALIDATION_STRINGENCY=SILENT \
    TMP_DIR=$TMPDIR \
    CREATE_INDEX=true \
    OPTICAL_DUPLICATE_PIXEL_DISTANCE=10
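
One thing to watch in the command above: if $TMPDIR is unset in your shell, TMP_DIR=$TMPDIR expands to an empty value. Below is a minimal sketch of a guard, assuming /tmp is an acceptable fallback on your machine (it may be too small for large sorts). resolve_tmp_dir is a made-up helper name.

```shell
#!/bin/sh
# Pick a temp directory for Picard, falling back to /tmp when TMPDIR is unset or empty.
# resolve_tmp_dir is a hypothetical helper; /tmp as the fallback is an assumption.
resolve_tmp_dir() {
    dir="${TMPDIR:-/tmp}"
    # Create the directory if needed and refuse to continue if it is not writable.
    mkdir -p "$dir" 2>/dev/null
    if [ ! -w "$dir" ]; then
        echo "temp directory $dir is not writable" >&2
        return 1
    fi
    echo "$dir"
}

PICARD_TMP=$(resolve_tmp_dir) || exit 1
echo "TMP_DIR=$PICARD_TMP"
```

The printed TMP_DIR=... value can then be used in place of TMP_DIR=$TMPDIR in the command.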

ADD COMMENT • link 12.2 years ago by NB ▴ 960

0

The error is due to Java heap space. The user tells the tool it has 256 GB of RAM, which I don't think is normal. So I don't think "MAX_RECORDS_IN_RAM" is the issue.

ADD REPLY • link 12.2 years ago by Ashutosh Pandey 12k

0

I tried again following your suggestion and the problem was fixed. Thank you very much!

ADD REPLY • link 12.2 years ago by xiaoyanli82 ▴ 10

score 0 · Answer 3 · 2013-05-23

0

12.2 years ago

Ashutosh Pandey 12k

Does your computer have 256 GB of RAM? I ask because you specified -Xmx256g on your command line. If your BAM file is 4 GB, I wouldn't bother changing that parameter and would just go with the default.
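
A quick way to check this before launching is to compare the requested heap with the machine's physical memory. The sketch below is Linux-only (it reads /proc/meminfo), and heap_fits is a made-up helper name. The optional second argument exists only so the check can be tried without touching the real system.

```shell
#!/bin/sh
# Return success if the requested heap (in GB) fits in physical memory.
# heap_fits is a hypothetical helper. Usage: heap_fits <heap_gb> [mem_total_kb]
heap_fits() {
    heap_gb=$1
    # Default to the MemTotal value (in kB) from /proc/meminfo on Linux.
    mem_kb=${2:-$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)}
    heap_kb=$((heap_gb * 1024 * 1024))
    [ "$heap_kb" -le "$mem_kb" ]
}

# Example with explicit memory sizes: a 256 GB heap on a 1024 GB machine, then on a 64 GB one.
if heap_fits 256 $((1024 * 1024 * 1024)); then echo "256g fits in 1024 GB"; fi
if ! heap_fits 256 $((64 * 1024 * 1024)); then echo "256g does not fit in 64 GB"; fi
```

In this thread the machine turned out to have 1024 GB, so this check alone would have passed; it only rules out the simplest cause.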