Hi all,
I tried running a hybrid assembly for a fungal genome. I have one set of paired-end Illumina reads and a MinION dataset. The assembled genome should be about 45 Mb. I ran the assembly on a machine with 35 cores and 250 Gb of RAM.
I am running SPAdes v3.10.1, and it errored out with error code "-6" and the message:
<jemalloc>: Error in malloc(): out of memory. Requested: 94287658224, active: 32942063616
In the manual it states: "SPAdes uses 512 Mb per thread for buffers, which results in higher memory consumption. If you set memory limit manually, SPAdes will use smaller buffers and thus less RAM."
For this run I specified 35 threads, and 35 threads * 512 Mb ≈ 18 Gb.
However, my machine has 250 Gb of RAM, so that should be well within its limits, shouldn't it?
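As a sanity check, the buffer arithmetic from the manual quote above can be worked through explicitly, alongside the size of the allocation that actually failed (taken from the jemalloc error message; the 512 Mb-per-thread figure is the one quoted from the SPAdes manual):

```python
# Per-thread buffer estimate, per the SPAdes manual: 512 Mb per thread.
threads = 35
buffer_gb = threads * 512 / 1024
print(f"Estimated buffer usage: {buffer_gb:.1f} Gb")   # 17.5 Gb

# The allocation that failed, from the jemalloc error (value is in bytes).
requested_bytes = 94287658224
print(f"Failed allocation: {requested_bytes / 1024**3:.1f} Gb")  # ~87.8 Gb
```

So the buffers alone are nowhere near the problem; the single failed allocation is roughly five times larger than the total buffer estimate.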
Thanks for any suggestions on how to modify my approach!
The log is below:
Command line:
/home/lina/SPAdes-3.10.1-Linux/bin/spades.py \
-1 /lina/analysis/nanopore/t111680/illumina/Sample_3701022/3701022_S3_R1_001_paired.fastq.gz \
-2 /lina/analysis/nanopore/t111680/illumina/Sample_3701022/3701022_S3_R2_001_paired.fastq.gz \
--nanopore /lina/analysis/nanopore/t111680/1d2/1d2.fastq \
--threads 35 -o /lina/analysis/nanopore/t111680/spades_out
System information:
SPAdes version: 3.10.1
Python version: 2.7.12
OS: Linux-4.4.0-83-generic-x86_64-with-Ubuntu-16.04-xenial
Output dir: /lina/analysis/nanopore/t111680/spades_out
Mode: read error correction and assembling
Debug mode is turned OFF
Dataset parameters:
Multi-cell mode (you should set '--sc' flag if input data was obtained with MDA (single-cell) technology or --meta flag if processing metagenomic dataset)
Reads:
Library number: 1, library type: paired-end
orientation: fr
left reads: ['/lina/analysis/nanopore/t111680/illumina/Sample_3701022/3701022_S3_R1_001_paired.fastq.gz']
right reads: ['/lina/analysis/nanopore/t111680/illumina/Sample_3701022/3701022_S3_R2_001_paired.fastq.gz']
interlaced reads: not specified
single reads: not specified
Library number: 2, library type: nanopore
left reads: not specified
right reads: not specified
interlaced reads: not specified
single reads: ['/lina/analysis/nanopore/t111680/1d2/1d2.fastq']
Read error correction parameters:
Iterations: 1
PHRED offset will be auto-detected
Corrected reads will be compressed (with gzip)
Assembly parameters:
k: automatic selection based on read length
Repeat resolution is enabled
Mismatch careful mode is turned OFF
MismatchCorrector will be SKIPPED
Coverage cutoff is turned OFF
Other parameters:
Dir for temp files: /lina/analysis/nanopore/t111680/spades_out/tmp
Threads: 35
Memory limit (in Gb): 250
======= SPAdes pipeline started. Log can be found here: /lina/analysis/nanopore/t111680/spades_out/spades.log
===== Read error correction started.
== Running read error correction tool: /home/lina/SPAdes-3.10.1-Linux/bin/hammer /lina/analysis/nanopore/t111680/spades_out/corrected/configs/config.info
0:00:00.000 4M / 4M INFO General (main.cpp : 83) Starting BayesHammer, built from N/A, git revision N/A
0:00:00.019 4M / 4M INFO General (main.cpp : 84) Loading config from /lina/analysis/nanopore/t111680/spades_out/corrected/configs/config.info
0:00:00.022 4M / 4M INFO General (memory_limit.hpp : 47) Memory limit set to 250 Gb
0:00:00.022 4M / 4M INFO General (main.cpp : 93) Trying to determine PHRED offset
0:00:00.023 4M / 4M INFO General (main.cpp : 99) Determined value is 33
0:00:00.024 4M / 4M INFO General (hammer_tools.cpp : 36) Hamming graph threshold tau=1, k=21, subkmer positions = [ 0 10 ]
0:00:00.024 4M / 4M INFO General (main.cpp : 120) Size of aux. kmer data 24 bytes
=== ITERATION 0 begins ===
0:00:00.026 4M / 4M INFO K-mer Index Building (kmer_index_builder.hpp : 428) Building kmer index
0:00:00.026 4M / 4M INFO K-mer Splitting (kmer_data.cpp : 91) Splitting kmer instances into 560 buckets. This might take a while.
0:00:00.026 4M / 4M INFO General (file_limit.hpp : 30) Open file limit set to 1024
0:00:00.026 4M / 4M INFO General (kmer_index_builder.hpp : 108) Memory available for splitting buffers: 2.38092 Gb
0:00:00.026 4M / 4M INFO General (kmer_index_builder.hpp : 116) Using cell size of 119837
0:01:15.991 19G / 19G INFO K-mer Splitting (kmer_data.cpp : 98) Processing /lina/analysis/nanopore/t111680/illumina/Sample_3701022/3701022_S3_R1_001_paired.fastq.gz
0:02:30.115 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 8671797 reads
0:03:52.462 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 17757244 reads
0:05:14.826 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 26847635 reads
0:06:38.972 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 35937553 reads
0:08:08.889 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 45073098 reads
0:09:40.280 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 54072450 reads
0:11:06.362 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 63129760 reads
0:12:31.079 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 72145772 reads
0:14:03.454 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 81288864 reads
0:15:29.876 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 90331193 reads
0:16:59.401 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 99462762 reads
0:18:25.410 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 108476084 reads
0:22:48.227 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 135849570 reads
0:23:35.285 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 98) Processing /lina/analysis/nanopore/t111680/illumina/Sample_3701022/3701022_S3_R2_001_paired.fastq.gz
0:50:39.101 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 108) Processed 271916466 reads
0:54:06.993 19G / 20G INFO K-mer Splitting (kmer_data.cpp : 113) Total 278480422 reads processed
0:54:08.662 140M / 20G INFO General (kmer_index_builder.hpp : 252) Starting k-mer counting.
1:04:54.515 140M / 20G INFO General (kmer_index_builder.hpp : 258) K-mer counting done. There are 3928652426 kmers in total.
1:04:54.515 140M / 20G INFO General (kmer_index_builder.hpp : 260) Merging temporary buckets.
1:11:48.688 140M / 20G INFO K-mer Index Building (kmer_index_builder.hpp : 437) Building perfect hash indices
1:11:48.688 140M / 20G WARN K-mer Index Building (kmer_index_builder.hpp : 451) Number of threads was limited down to 24 in order to fit the memory limits during the index construction
1:15:39.878 1G / 98G INFO General (kmer_index_builder.hpp : 276) Merging final buckets.
1:20:35.719 1G / 98G INFO K-mer Index Building (kmer_index_builder.hpp : 483) Index built. Total 1283564560 bytes occupied (2.61375 bits per kmer).
1:20:35.724 1G / 98G INFO K-mer Counting (kmer_data.cpp : 359) Arranging kmers in hash map order
1:29:42.975 59G / 98G INFO General (main.cpp : 155) Clustering Hamming graph.
2:52:58.884 59G / 98G INFO General (main.cpp : 162) Extracting clusters
3:51:36.260 59G / 129G INFO General (main.cpp : 174) Clustering done. Total clusters: 1071603427
3:51:39.094 30G / 129G INFO K-mer Counting (kmer_data.cpp : 381) Collecting K-mer information, this takes a while.
<jemalloc>: Error in malloc(): out of memory. Requested: 94287658224, active: 32942063616
== Error == system call for: "['/home/lina/SPAdes-3.10.1-Linux/bin/hammer', '/lina/analysis/nanopore/t111680/spades_out/corrected/configs/config.info']" finished abnormally, err code: -6
======= SPAdes pipeline finished abnormally and WITH WARNINGS!
=== Error correction and assembling warnings:
* 1:11:48.688 140M / 20G WARN K-mer Index Building (kmer_index_builder.hpp : 451) Number of threads was limited down to 24 in order to fit the memory limits during the index construction
======= Warnings saved to /lina/analysis/nanopore/t111680/spades_out/warnings.log
=== ERRORs:
* system call for: "['/home/lina/SPAdes-3.10.1-Linux/bin/hammer', '/lina/analysis/nanopore/t111680/spades_out/corrected/configs/config.info']" finished abnormally, err code: -6
In case you have troubles running SPAdes, you can write to spades.support@cab.spbu.ru
Please provide us with params.txt and spades.log files from the output directory.
Not sure what unit that 94287658224 is in, but even if it is kilobytes it is still a lot more RAM than you have available.

Yes, that number is very large! I am not entirely sure what the units are, but the math doesn't seem to work out either way :-/
It's actually not that large. It's probably in bytes, which would correspond to about 88 Gb. That is manageable on high-memory cloud instances or HPC nodes.
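Assuming the jemalloc figures are in bytes (the usual convention for malloc request sizes), both numbers from the error message convert as follows:

```python
# Values copied from the jemalloc error message, assumed to be in bytes.
requested = 94_287_658_224   # size of the allocation that failed
active = 32_942_063_616      # memory jemalloc reported as active

GIB = 1024 ** 3  # bytes per GiB
print(f"requested: {requested / GIB:.1f} GiB")  # ~87.8 GiB
print(f"active:    {active / GIB:.1f} GiB")     # ~30.7 GiB
```

Under that assumption the failed request is well under the machine's 250 Gb, which suggests the process had already consumed most of the RAM elsewhere (the log shows it at 129G shortly before the crash) when this large allocation was attempted.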