I've been running BWA sampe, and it seems to take forever to finish. stderr gets stuck at the "Unmapped read alignment" step, and BWA also estimates an extremely large insert size.
I read some other posts, and it seems this happens because BWA sometimes estimates an unrealistic insert size (say, up to 1 Mb), which makes unmapped-mate alignment very time-consuming because it has to run Smith-Waterman over a very large window.
Is this a bug, or just bad luck? Why do such errors happen, and what should I do? Just re-run the program? I've run BWA many, many times, and this is the first time I've come across this problem.
Someone also suggested using "-A" to disable insert size estimation, but unfortunately that also skips alignment of unmapped mates, which is not good because I need those "problematic" reads for further analysis.
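To get a feel for why an inflated insert size estimate hurts so much, here is a rough back-of-the-envelope sketch. It assumes a plain, unbanded Smith-Waterman whose cost is simply read length times window length, and a 100 bp read; BWA's real window computation and implementation differ, so treat the numbers as a scaling illustration only.

```python
def sw_cells(read_len: int, window: int) -> int:
    """Number of DP cells a plain (unbanded) Smith-Waterman fills
    when aligning one read against a reference window."""
    return read_len * window


READ_LEN = 100  # assumed read length in bp, for illustration only

# Compare a typical library insert size with a runaway estimate.
normal = sw_cells(READ_LEN, 500)
runaway = sw_cells(READ_LEN, 1_000_000)

print(f"window 500 bp:  {normal:,} cells per rescue attempt")
print(f"window 1 Mb:    {runaway:,} cells per rescue attempt")
print(f"slowdown factor: {runaway // normal}x")
```

Under these assumptions each mate rescue becomes about three orders of magnitude more expensive, which would be consistent with a run that appears to hang.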
Thanks
I wrote a patch for BWA that adds a new argument to sampe (-d INT), which disables Smith-Waterman dynamically once the number of pair rescue attempts exceeds INT.
without -d:
with -d (4000):
Do you guys find it useful?
Code here: https://github.com/drio/bwa/commit/250a88ccb30d782cefd34b4cd2f3c66bd2a023be
At the time of writing this comment, the pull request is still under review.
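For readers who want the gist of the -d idea without reading the C diff, here is a minimal Python sketch of the logic as I understand it from the description above. It is not the patch itself: the class name, the method name, and the choice to count attempts globally (rather than per batch of reads) are all assumptions made for illustration.

```python
class RescueLimiter:
    """Hypothetical helper: allow Smith-Waterman mate rescue until the
    number of attempts exceeds a threshold, then switch it off for good."""

    def __init__(self, max_attempts: int) -> None:
        self.max_attempts = max_attempts  # plays the role of -d INT
        self.attempts = 0
        self.disabled = False

    def allow(self) -> bool:
        """Return True if the next rescue attempt should run SW."""
        if self.disabled:
            return False
        self.attempts += 1
        if self.attempts > self.max_attempts:
            # Threshold exceeded: disable SW for all later pairs.
            self.disabled = True
            return False
        return True


# Toy run with a tiny threshold so the switch-over is visible.
limiter = RescueLimiter(max_attempts=3)
decisions = [limiter.allow() for _ in range(5)]
print(decisions)
```

The trade-off is the same as with any cut-off: pairs seen after the threshold is hit get no rescue at all, so the value of INT decides how many "problematic" reads you keep.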
While the alignment has not finished yet, how can we plot the insert size distribution? I'm curious, and would be grateful for your guidance.
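One possible approach, assuming sampe has already written some SAM records to its output file, is to read whatever is there and tabulate the template length (TLEN, column 9 of the SAM format) for properly paired reads. The sketch below uses only the Python standard library; the function names and the 50 bp bin width are my own choices, and it skips any truncated last line, since the file may still be in the middle of a write.

```python
from collections import Counter


def insert_sizes(sam_lines):
    """Collect positive TLEN values from properly paired SAM records."""
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # truncated record in a partial file
            continue
        try:
            flag, tlen = int(fields[1]), int(fields[8])
        except ValueError:
            continue
        # 0x2 = "properly paired"; tlen > 0 counts each pair only once.
        if flag & 0x2 and tlen > 0:
            sizes.append(tlen)
    return sizes


def histogram(sizes, bin_width=50):
    """Bin insert sizes; keys are the lower edge of each bin."""
    counts = Counter((s // bin_width) * bin_width for s in sizes)
    return dict(sorted(counts.items()))


def print_histogram(hist, max_bar=60):
    """Print a simple text histogram scaled to max_bar characters."""
    peak = max(hist.values(), default=1)
    for lower, count in hist.items():
        bar = "#" * max(1, count * max_bar // peak)
        print(f"{lower:>8} {count:>8} {bar}")


# Usage (file name is hypothetical):
#   with open("partial.sam") as fh:
#       print_histogram(histogram(insert_sizes(fh)))
```

Keep in mind that a partial file only reflects the reads processed so far, so the distribution may shift once the run completes.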
Run sampe with -s to disable Smith-Waterman (SW) for the unmapped mate.