Hi community,
Recently I have used BWA and Bowtie2 to align simulated DNA sequencing data to test our sequencing simulator. I got some errors from both aligners:
BWA: submit.sh: line 48: 6881 Segmentation fault (core dumped)
BOWTIE2: terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)
(ERR): bowtie2-align exited with value 134
or (ERR): bowtie2-align died with signal 6 (ABRT)
I searched online and found that most posts attribute this kind of error to a memory shortage, so I monitored the memory usage during the alignment. BWA consistently took ~9GB and Bowtie2 consistently took ~5GB in total. I also ran a script to check the memory every 30 seconds and found that both aligners occupied no more than 10% of the memory and that there was always ~100GB of memory available. I then tried using fewer threads (5 threads, for example) and assigning each thread 9GB of memory, but still got the same error. So I feel it is unlikely to be a memory issue.
The data that triggers these errors has 100x coverage of the human genome, so a single FASTQ file is 300-400GB. I also tried lower-depth data (e.g. 15x coverage) from the same simulator, and the alignment completed without issue. I am not sure whether the simulated data is simply too deep, but I would expect a larger number of reads to make the aligner take longer to finish rather than throw an error.
Has anyone encountered a similar issue, or does anyone know what the problem might be or have a hint on how to fix it? Many thanks!
Thanks for the reply and suggestion.
I will monitor the memory every second to see if there is a spike that exceeds the total amount of memory, thanks!
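Polling once per second can still miss a very short spike. As a complementary check, here is a minimal Python sketch (not from the original posts) that runs a command and reports the peak resident memory the kernel recorded for it. It assumes Linux, where `ru_maxrss` is reported in kilobytes. The function name `peak_rss_kb` is hypothetical, and the demo workload is a stand-in for the real `script.sh`.

```python
import resource
import subprocess
import sys


def peak_rss_kb(cmd):
    """Run cmd to completion and return (exit_code, peak_rss).

    ru_maxrss for RUSAGE_CHILDREN is the largest resident set size among
    child processes that have been waited for, so call this from a fresh
    Python process to get a clean reading. On Linux the unit is kilobytes.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Stand-in workload: a child Python process that allocates about 200 MB.
    # Replace with the real aligner call, e.g. ["bash", "script.sh"].
    demo = [sys.executable, "-c", "x = bytearray(200 * 1024 * 1024)"]
    code, peak = peak_rss_kb(demo)
    print(f"exit code {code}, peak RSS about {peak / 1024:.0f} MB")
```

A negative exit code from `subprocess.run` means the child was killed by a signal, for example -6 for SIGABRT, which would match the bowtie2 message above.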
Do you use a job scheduler such as SLURM? If so, please add the submission parameters.
I used SLURM on a larger server; the command is something like:
sbatch -p xxx -w xxx -t 71:00:00 -c 16 --mem 46G script.sh
or
sbatch -p xxx -w xxx -t 71:00:00 -c 5 --mem-per-cpu 9G script.sh
-p xxx and -w xxx point to the pool and compute node defined on the server, and script.sh contains the command to run bwa or bowtie2.
For the memory test, I used a smaller cluster that is a normal Ubuntu system, so I just opened a screen session and ran script.sh.
thanks for your time
I monitored the memory usage every second and plotted it as follows. Since it consistently took 2.7% until exiting, I only cropped the short time frame close to the end point. There is a spike, but it only reached around 25% of the memory.
I tried 50x coverage and got the same error... now I am confused. Any suggestions for debugging? Could something be wrong with the simulator? Thanks!
I think this is almost certainly a memory issue. You can (dis)prove that: try your command with a single CPU, use more memory than 46G, and take out the part that says --mem-per-cpu 9G. I realize doing it that way will be slow, but if it runs without a problem, it means that your combination of total memory, number of CPUs and memory allocation per CPU is not giving the program enough memory to work with. Alternatively, your total memory should be at least 10% higher than 5*9G, because scripts take up memory for reasons other than just what BWA or Bowtie2 need.
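The arithmetic behind that rule of thumb can be written out. This is only an illustrative sketch: the 10% headroom figure comes from the reply above, and the function name `mem_request_gb` is made up for the example.

```python
def mem_request_gb(threads, gb_per_thread, headroom=0.10):
    """Return (base, padded): the per-thread budget summed over threads,
    and the same figure with a fractional safety margin added."""
    base = threads * gb_per_thread
    return base, base * (1 + headroom)


base, padded = mem_request_gb(threads=5, gb_per_thread=9)
print(f"base {base} GB, with 10% headroom {padded:.1f} GB")
```

For 5 threads at 9G each this gives 45G, or about 49.5G with headroom. For comparison, the first sbatch command above requested 46G in total across 16 CPUs.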
Thanks for your suggestion. I ran more tests on simulated data at different coverages using different aligners. Bowtie2 finished on the 25x and 10x coverage data without error but failed on the 5x data with a std::bad_alloc error. I also tested minimap2, bwa mem and bowtie2 on the same 15x coverage data on nodes with the same configuration (8 cores, 23G memory): only bwa mem finished without error, bowtie2 exited with a std::bad_alloc error, and minimap2 exited with Segmentation fault (core dumped) and a SEQ and QUAL are of different length message. Is this possible? bwa mem also failed on the previous 100x data with a Segmentation fault (core dumped) or SEQ and QUAL are of different length error. Am I missing some key options for these aligners? Currently, I only specify the input file, output file, number of threads and reference genome for all of them. Thanks!
The SEQ and QUAL are of different length message points to a different problem, and that could be an issue with the simulator rather than memory. It means that in one or more of your reads the length of the sequence line is not the same as the length of the line with quality values. You can remove those reads with reformat.sh, which is part of the BBTools package.
Thanks for the quick response, but what confuses me is that for the same data, bwa mem finished without error, minimap2 gave this error, and bowtie2 still exited with the std::bad_alloc error. Is this possible? Or did I miss some configuration for the tools?
I have seen before that some aligners quit when faced with unequal read and quality lengths, while others just power through. If you know there is a problem, I think it is always a good idea to fix it.
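To locate the offending reads before removing them, a small independent check can help. The following is a minimal Python sketch, not the reformat.sh approach mentioned above. It assumes plain 4-line FASTQ records (no line-wrapped sequences), and the function name `check_fastq` is hypothetical.

```python
import io


def check_fastq(handle, limit=10):
    """Return up to `limit` problems as (record_number, header, description).

    Flags records whose header or separator line is malformed, and records
    whose SEQ and QUAL lines differ in length.
    """
    problems = []
    record = 0
    while len(problems) < limit:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip blank lines between or after records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        record += 1
        name = header.rstrip("\r\n")
        if not name.startswith("@") or not plus.startswith("+"):
            problems.append((record, name, "malformed record"))
        elif len(seq) != len(qual):
            problems.append((record, name, f"SEQ={len(seq)} QUAL={len(qual)}"))
    return problems


# Tiny in-memory example: the second read has 5 bases but 3 quality values.
demo = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGTA\n+\nIII\n"
)
for problem in check_fastq(demo):
    print(problem)
```

On a real file, pass an open text handle instead of the demo, for example `open(path)` or `gzip.open(path, "rt")` for compressed input. A truncated final record shows up as a length mismatch or a malformed record, which is worth ruling out for files of several hundred GB.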
Thanks. We ran fastQValidator on the simulator data and found a lot of repeated sequenced identifier errors, but did not find any error related to unequal read and quality lengths. Do you think the repeated sequenced identifier error is the reason the memory blows up? If so, how? Thanks!
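Whether duplicated read names explain the crashes is not established in this thread, but it is easy to quantify how many there are. Below is a minimal Python sketch that counts repeated read names. It assumes 4-line FASTQ records and takes the read name as the text after `@` up to the first whitespace. The function name and the `max_records` parameter are hypothetical. Because it keeps every name in memory, it is meant for a subset of the data rather than a full 100x file.

```python
import io
from collections import Counter


def duplicate_read_names(handle, max_records=None):
    """Return {read_name: count} for names seen more than once.

    Only header lines (every 4th line, starting at line 0) are inspected.
    If max_records is given, stop after that many records.
    """
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 != 0:
            continue  # not a header line
        if max_records is not None and i // 4 >= max_records:
            break
        fields = line[1:].split()  # drop the leading '@', split off any comment
        name = fields[0] if fields else ""
        counts[name] += 1
    return {name: n for name, n in counts.items() if n > 1}


# Tiny in-memory example: "sim_1" appears twice with different comments.
demo = io.StringIO(
    "@sim_1 first\nACGT\n+\nIIII\n"
    "@sim_2\nACGT\n+\nIIII\n"
    "@sim_1 second\nTTTT\n+\nIIII\n"
)
print(duplicate_read_names(demo))
```

If the simulator writes paired reads into one file with the same name for both mates, each pair will be reported as a duplicate, so the counts need to be read with the simulator's naming scheme in mind.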