Hello all,
I am using bwa mem to align contigs to a reference genome and I observed that it is exiting with an error message "Segmentation Fault". For example:
[M::main_mem] read 1 sequences (405327 bp)...
Segmentation fault
Upon closer examination, I found that this message appears whenever the input sequence is >= 0.4 Mbp. The bwa manual page (http://bio-bwa.sourceforge.net/bwa.shtml) mentions that bwa mem can handle contigs up to 1 Mbp. I am unable to figure out the source of this upper limit at 0.4 Mbp. Is it specific to my computer, or is there a parameter by which I can assign more memory to bwa?
I am currently thinking of breaking contigs greater than 0.4 Mbp into overlapping smaller contigs of size 399999 bp. Is there any other way to go about it?
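For reference, the splitting workaround could be sketched roughly as below. This is only an illustration: the function name `split_with_overlap` and the 10,000 bp default overlap are assumptions made for the example. Only the 399999 bp chunk size comes from the question. Each chunk is returned with its start offset so that alignment coordinates can be mapped back to the original contig.

```python
from typing import Iterator, Tuple


def split_with_overlap(
    seq: str,
    chunk_size: int = 399_999,  # chunk size mentioned in the question
    overlap: int = 10_000,      # assumed value, not from the thread
) -> Iterator[Tuple[int, str]]:
    """Yield (start_offset, chunk) pairs that cover seq with overlapping windows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    if not seq:
        return

    step = chunk_size - overlap  # distance between consecutive chunk starts
    start = 0
    while True:
        yield start, seq[start:start + chunk_size]
        if start + chunk_size >= len(seq):
            break  # this chunk already reaches the end of the contig
        start += step


if __name__ == "__main__":
    # Toy contig, small numbers so the windows are easy to inspect.
    contig = "ACGT" * 6 + "A"
    for offset, chunk in split_with_overlap(contig, chunk_size=10, overlap=2):
        print(f">contig_offset_{offset}\n{chunk}")
```

The overlap matters because an alignment that spans a cut point would otherwise be lost. It should be at least as long as the shortest alignment you care about.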
Thanks all,
Ramya
cross posted: http://seqanswers.com/forums/showthread.php?p=138477
@Pierre Lindenbaum I am aligning contigs of a hybrid to its parents' genomes. Although I have reason to believe that the query and the reference will be very similar at the sequence level (>95% similar), I expect to see gross chromosomal rearrangements. I therefore believe that blat, which is suited for aligning at the 'DNA level between two sequences that are of 95% or greater identity, but which may include large inserts', may not work well for me. What are your thoughts?
@Chris Fields Oh, so bwa-mem did run for such large contigs. Now I wonder! I ran bwa-mem with a single contig of size 405327 bp and it exited with 'Segmentation fault'. When I ran it with a single contig of size 397720 bp, it ran perfectly well, hence my conclusion. Maybe you changed some parameter while compiling the source code and I kept the default. Or did you run it on a system with more RAM than mine (mine has 16 GB of RAM)? Just guesses.
I ran bwa-sw and it worked for me. So I can go ahead, at least for the time being.
Thanks,
Ramya
If BWA-MEM segfaults, it is a bug. Is your data public? Could you share the data with me? I will only use your data for debugging purposes. Thank you.
@lh3-Alt Thanks for your help. I have prepared a folder 'Debug' with the following files: (1) query_sample.fsa (1 sequence) (2) target_sample.fsa (3) files generated upon indexing target_sample.fsa (4) readme.txt with the exact commands I ran. bwa-mem exits with a "Segmentation fault" for this example as well. I have uploaded a tar file of this folder to my Google Drive (https://drive.google.com/file/d/0ByTU79pGWWI0TmJkaWZsem1MRnM/edit?usp=sharing), as I could not find a way to upload it to Biostars itself.
Kindly get back to me in case you need more input.
Thanks,
Ramya
Thanks for the example. I see you were using bwa-0.7.0. Please try more recent versions. The first release of bwa-mem indeed has bugs, many of which have been fixed over time.
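To check which version is installed before rerunning, something like the following could help. It is a rough sketch and rests on an assumption: that running `bwa` with no arguments prints a usage banner containing a `Version:` line, as the builds I have seen do. The helper names `parse_bwa_version` and `installed_bwa_version` are made up for this example.

```python
import re
import subprocess
from typing import Optional, Tuple


def parse_bwa_version(usage_text: str) -> Optional[Tuple[int, ...]]:
    """Extract the numeric version, e.g. (0, 7, 8), from bwa's usage banner."""
    match = re.search(r"^Version:\s*(\d+(?:\.\d+)*)", usage_text, flags=re.MULTILINE)
    if match is None:
        return None  # banner format differs from what this sketch assumes
    return tuple(int(part) for part in match.group(1).split("."))


def installed_bwa_version() -> Optional[Tuple[int, ...]]:
    """Run `bwa` with no arguments and parse whatever banner it prints."""
    result = subprocess.run(["bwa"], capture_output=True, text=True)
    # The banner is assumed to go to stderr; stdout is checked as well to be safe.
    return parse_bwa_version(result.stderr + result.stdout)


if __name__ == "__main__":
    version = installed_bwa_version()
    if version is None:
        print("Could not determine the bwa version")
    elif version <= (0, 7, 0):
        print("bwa", version, "is the first bwa-mem release; consider upgrading")
    else:
        print("bwa", version)
```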
@lh3-Alt Yes, you were right. It ran perfectly with the latest version (bwa-0.7.8). Thank you, everyone, for the input and help :-)
Yours Truly,
Ramya
PS: How do I accept this as the solution and close the thread?
Ah, one of the problems I have seen in Biostar, namely that the answers are in comments. Basically, @lh3-Alt could post that using the latest bwa-mem is the answer below, and you would accept it. Or you could post the answer and indicate @lh3-Alt answered it in the comments, then accept it :)
Also, see this thread regarding some reasoning why you can't accept a comment as an answer.