Hi everyone,
I am working with wheat exome data (Illumina PE 100 reads) and analysed it following the GATK pipeline. Reads were aligned to the reference with BWA and the BAM files were sorted with samtools. But when I tried to mark PCR duplicates with Picard, I got an error:
java -Xms16g -Xmx256g -jar MarkDuplicates.jar I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup REMOVE_DUPLICATES=false MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 VALIDATION_STRINGENCY=LENIENT
Error: Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
I also tried other options:
java -Xmx160g -jar MarkDuplicates.jar I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup REMOVE_DUPLICATES=false MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 VALIDATION_STRINGENCY=LENIENT
Error: java.lang.OutOfMemoryError: Java heap space
The BAM file is 4 GB, the Java version is 1.7.0_11, and Picard is version 1.92. What can I do to solve this problem?
Thank you !
Xiaoyan
It should be "-XX:MaxPermSize=1g", but I don't think that is the issue here. Trying to allocate that much memory is, IMO.
Oh yeah, sorry about that. Fixed it. And it could be as you said, too. But I'm curious to know how much RAM his system actually has!
The Linux system has 1024 GB of RAM.
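As a sanity check (hypothetical diagnostic commands, assuming a Linux host with a HotSpot JVM), the RAM the OS reports and the heap size the JVM will actually grant for a given `-Xmx` can be verified directly:

```shell
# Physical memory as seen by Linux, in gigabytes
free -g

# Effective MaxHeapSize the HotSpot JVM settles on for -Xmx160g
# (-XX:+PrintFlagsFinal is a standard HotSpot diagnostic flag)
java -Xmx160g -XX:+PrintFlagsFinal -version | grep -i MaxHeapSize
```

If the reported MaxHeapSize is far below the requested 160 GB (for example because a 32-bit JVM is installed), that alone would explain the heap-space error.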
Sounds good then. Which OS and JVM implementation are you running? And what is the output if you specify
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
?
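For reference, those GC and logging flags can be combined with the original MarkDuplicates invocation like this (a sketch only; the jar path and file names are copied from the question, and the 16 GB heap is an assumption chosen to stay well inside physical memory, not the asker's exact setting):

```shell
# Hypothetical combined run: a moderate heap plus CMS GC and GC logging,
# so the GC output shows where the memory is going. All Picard options
# below are taken from the original command.
java -Xmx16g \
     -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode \
     -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -jar MarkDuplicates.jar \
     I=IN.bam O=OUT.DeDup.bam METRICS_FILE=OUT.dedup \
     REMOVE_DUPLICATES=false \
     MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 \
     VALIDATION_STRINGENCY=LENIENT
```

If the heap really is being exhausted, Picard's `MAX_RECORDS_IN_RAM` option can also be lowered to shrink the in-memory sorting buffer, at the cost of more temporary spill files on disk.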