Hi everyone,
I'm running bwa in sampe mode and, after successfully processing more than 10M reads, it fails with a segmentation fault (log below) on what appears to be a set of poorly alignable reads. Any suggestions on how to overcome this problem would be much appreciated. Many thanks!
# chunk processed ok
[bwa_read_seq] 2.8% bases are trimmed.
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] (25, 50, 75) percentile: (492, 529, 561)
[infer_isize] low and high boundaries: 354 and 699 for estimating avg and std
[infer_isize] inferred external isize from 55618 pairs: 523.901 +/- 55.549
[infer_isize] skewness: -0.798; kurtosis: 0.914; ap_prior: 2.38e-04
[infer_isize] inferred maximum insert size: 897 (6.71 sigma)
[bwa_sai2sam_pe_core] time elapses: 9.46 sec
[bwa_sai2sam_pe_core] changing coordinates of 9766 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_paired_sw] 76039 out of 91346 Q17 singletons are mated.
[bwa_paired_sw] 5151 out of 16392 Q17 discordant pairs are fixed.
[bwa_sai2sam_pe_core] time elapses: 52.09 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 4.60 sec
[bwa_sai2sam_pe_core] print alignments... 1.93 sec
[bwa_sai2sam_pe_core] 11010048 sequences have been processed.
# failed chunk
[bwa_read_seq] 3.1% bases are trimmed.
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] fail to infer insert size: too few good pairs
[bwa_sai2sam_pe_core] time elapses: 11.07 sec
[bwa_sai2sam_pe_core] changing coordinates of 0 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_paired_sw] 0 out of 0 Q17 singletons are mated.
[bwa_paired_sw] 68090 out of 140740 Q17 discordant pairs are fixed.
[bwa_sai2sam_pe_core] time elapses: 63.65 sec
[bwa_sai2sam_pe_core] refine gapped alignments...
/spool/1339511835.1833481: line 8: 9921 Segmentation fault (core dumped)
../../bwa-0.6.1/bwa sampe -a 2000 ../myGenome.fa myReads.PE500.4.1.sai myReads.PE500.4.2.sai myReads.PE500.4.1.fq myReads.PE500.4.2.fq > myReads.PE500.4.bwape.sam
I had some seg fault problems with sampe a while ago and, after trying 101 different things, I eventually gave up and used bowtie2. So I'm curious to see what happens here.
That said, I would suggest mentioning which version of bwa you're using (or have tried) and, if possible, posting part of your failing data file somewhere if you want others to take a look.
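For trimming the input down to something postable, here is a minimal sketch that pulls a range of records out of a FASTQ file. It assumes strict 4-line FASTQ records (no wrapped sequence lines), and the function name `extract_fastq_records` is made up for illustration. In the log above, 11010048 sequences were processed before the crash, which is 42 x 262144, so the guess is that sampe works in chunks of 262144 read pairs and that the next 262144 records of each file are the ones that fail. Treat that chunk size as an inference from the log, not a documented value.

```shell
# Hypothetical helper: print COUNT FASTQ records from FILE after skipping
# the first SKIP records. Assumes exactly 4 lines per record.
extract_fastq_records() {
    file=$1
    skip=$2
    count=$3
    first=$((skip * 4 + 1))          # first line to keep
    last=$(((skip + count) * 4))     # last line to keep
    awk -v first="$first" -v last="$last" '
        NR > last   { exit }         # stop reading once past the range
        NR >= first { print }
    ' "$file"
}

# Possible usage with the file names from the question (chunk size is a guess):
#   extract_fastq_records myReads.PE500.4.1.fq 11010048 262144 > failing.1.fq
#   extract_fastq_records myReads.PE500.4.2.fq 11010048 262144 > failing.2.fq
```

If the crash depends only on the reads in that chunk, re-running aln and sampe on the two small files should reproduce it; if it does not, the subset is at least a starting point for narrowing things down.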
Haha. We both posted at the same time with the same answer.
I posted this question to the bwa mailing list and received the following patch from Mark Kelly that may fix this bug:
I am re-running with the patched version now; if it works, I will post it as an update to the question and close it.
Update: With this patch, the run lasted a little longer but eventually died in a very similar setting. Perhaps the patch is in the right place, but is not fully adequate...
I had this exact problem and tried all the usual suspects. Sadly I couldn't figure it out and ended up using bowtie 2.
Same here:
bwa sampe -s -A -r "@RG\tID:id\tLB:foo\tSM:id\tPL:ILLUMINA" hg19 1.sai 2.sai 1.fq 2.fq
...
[bwa_sai2sam_pe_core] convert to sequence coordinate...
Segmentation fault (core dumped)
This happens within 2 seconds of starting the program. For me, the crash is reproducible regardless of the @RG string length and of whether the output is written to a .sam file or piped to samtools.
I had this exact same seg fault 2 seconds after starting. The .sai file was produced on a cluster using bwa 0.7, and I was running sampe on another server with bwa 0.6. When I installed bwa 0.7 and ran bwa sampe with it, it worked perfectly. It might be worth trying the new version. Hope that helps people with this same issue.
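To check for this kind of mismatch, a small sketch for reading the version out of a bwa binary follows. It assumes that bwa, when run without arguments, prints a usage message containing a "Version:" line (to stderr, with a non-zero exit status). The function names `bwa_version_from_usage` and `bwa_version` are hypothetical helpers, not part of bwa.

```shell
# Read bwa's usage text on stdin and print the version string
# (the second field of the first line starting with "Version:").
bwa_version_from_usage() {
    awk '$1 == "Version:" { print $2; exit }'
}

# Hypothetical helper: report the version of the bwa binary given as $1.
# The usage text goes to stderr, so stderr is merged into the pipe.
bwa_version() {
    "$1" 2>&1 | bwa_version_from_usage
}

# Possible usage, run on each machine involved:
#   bwa_version ../../bwa-0.6.1/bwa
#   bwa_version "$(command -v bwa)"
```

Running this on the machine that produced the .sai files and on the machine that runs sampe, then comparing the two strings, should show whether the same release is used for both steps.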