Error in malloc (SPAdes assembler)
3.9 years ago
asgiraldoc • 0

Hi all,

I have 3 paired-end libraries from Illumina sequencing (151 bp), each with almost 15M reads. However, when I run SPAdes I run out of RAM when the assembly starts with k-mer 55; the error looks like this: jemalloc: Error in malloc(): out of memory Requested: 8388608. So the genome assembly could not be completed. It's a small genome (25 Mb).

I'm working on a server with 1.5 TB of RAM, and this is my command:

spades.py --careful -k 55,77,99 -t 32 -m 1000 --pe1-1 work/mapeo2_F.fastq --pe1-2 work/mapeo2_R.fastq --pe1-1 work/mapeo1_F.fastq --pe1-2 work/mapeo1_R.fastq --pe1-1 work/mapeo3_F.fastq --pe1-2 work/mapeo3_R.fastq -o work/Hcol

Do you have any idea of how to overcome that error?

Tags: assembly • genome • sequencing • software error • 1.1k views

You may have too much data for a small genome. Consider normalizing your sequence reads with bbnorm.sh before trying this assembly.
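For example, per library, something like the following (a sketch, assuming BBTools is installed and on your PATH; the input file names match the question, and the target/min depths are illustrative values you may want to tune):

```shell
# Normalize one paired-end library to ~100x depth,
# discarding reads with coverage below 5x (bbnorm.sh from BBTools)
bbnorm.sh in=work/mapeo1_F.fastq in2=work/mapeo1_R.fastq \
    out=work/mapeo1_F.norm.fastq out2=work/mapeo1_R.norm.fastq \
    target=100 min=5
```

Repeat for the other two libraries, then feed the normalized files to SPAdes.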

I'm working on a server with 1.5Tb of RAM

Are you the only user on this machine? If not, your program may have much less RAM available to work with.


Not related to the RAM consumption, but you are running SPAdes incorrectly. If you have three libraries, you should provide them as --pe1-1, --pe1-2, --pe2-1, --pe2-2, --pe3-1, --pe3-2. The number after "pe" is the number of the library.
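Concretely, the corrected invocation would look like this (same file names and options as in the question):

```shell
# Three paired-end libraries: the digit after "pe" selects the library,
# the digit after the dash selects forward (1) or reverse (2) reads
spades.py --careful -k 55,77,99 -t 32 -m 1000 \
    --pe1-1 work/mapeo1_F.fastq --pe1-2 work/mapeo1_R.fastq \
    --pe2-1 work/mapeo2_F.fastq --pe2-2 work/mapeo2_R.fastq \
    --pe3-1 work/mapeo3_F.fastq --pe3-2 work/mapeo3_R.fastq \
    -o work/Hcol
```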


You can also check how much memory is actually free with free -h. Your actual free memory will be somewhere between the "free" column and the "available" column. In theory all the memory in the "available" column should be accessible to you, but in practice we've found that this sometimes isn't the case (e.g. we once had a case where a memory-mapped file was being kept in memory after the program that used it terminated, and wasn't being released when the OS asked for it).
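If you want to check this programmatically, here is a minimal sketch (Linux only) that reads /proc/meminfo directly; these are the same numbers free reports:

```python
# Report free vs. available memory by reading /proc/meminfo (Linux only)
def meminfo_kb(field):
    """Return the value of a /proc/meminfo field, in kB."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)

GIB = 1024 ** 2  # /proc/meminfo values are in kB
print(f"MemFree:      {meminfo_kb('MemFree') / GIB:.1f} GiB")
print(f"MemAvailable: {meminfo_kb('MemAvailable') / GIB:.1f} GiB")
```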

3.9 years ago
shelkmike ★ 1.4k

SPAdes monitors RAM usage during its run. Can you look at its logs and find what the RAM usage was before SPAdes crashed?
Another way to find the peak RAM consumption of a program is to run it with a command that starts with "/usr/bin/time -v"; in your case the command would be:

/usr/bin/time -v spades.py --careful -k 55,77,99 -t 32 -m 1000 --pe1-1 work/mapeo2_F.fastq --pe1-2 work/mapeo2_R.fastq --pe1-1 work/mapeo1_F.fastq --pe1-2 work/mapeo1_R.fastq --pe1-1 work/mapeo3_F.fastq --pe1-2 work/mapeo3_R.fastq -o work/Hcol

and then look at the line "Maximum resident set size (kbytes):" in the output when SPAdes crashes.

In my experience, 1.5 TB of RAM should definitely be enough to assemble a 25-megabase genome.

