I am using BWA mem to align a set of 40-mers against themselves, and I find that it gives different outputs for different datasets. I hope someone can explain how to choose parameters to control this. As an example, take these two 40-mers, which have 2 mismatches:
>S37474_17
NNNNNAGGTGTTTTTACCTTTCTCAGCATTCCACAAGTTACTTCTNNNNN
>S1536_2
NNNNNATGTATTTTTACCTTTCTCAGCATTCCACAAGTTACTTCCNNNNN
I align using the command bwa mem -k10 -a -D 0.001 -c10000 -A1 -B1 -O20 -E20 -L100 -T 30, and it outputs the alignment between these two 40-mers with CIGAR 50M.
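A minimal sketch of that two-sequence run, for anyone who wants to reproduce it. The file names kmers.fa and kmers_self.sam are hypothetical, and the sketch assumes the same FASTA file is used as both the indexed reference and the query. The bwa mem options are exactly the ones quoted above.

```shell
# Write the two example sequences to a FASTA file (hypothetical file name).
cat > kmers.fa <<'EOF'
>S37474_17
NNNNNAGGTGTTTTTACCTTTCTCAGCATTCCACAAGTTACTTCTNNNNN
>S1536_2
NNNNNATGTATTTTTACCTTTCTCAGCATTCCACAAGTTACTTCCNNNNN
EOF

if command -v bwa >/dev/null 2>&1; then
    # Index the k-mers, then align the same file against that index.
    bwa index kmers.fa
    bwa mem -k10 -a -D 0.001 -c10000 -A1 -B1 -O20 -E20 -L100 -T 30 \
        kmers.fa kmers.fa > kmers_self.sam
    # Print query name, reference name, position and CIGAR for each record.
    grep -v '^@' kmers_self.sam | cut -f1,3,4,6
else
    echo "bwa not found on PATH; only kmers.fa was written" >&2
fi
```

Running the same commands on the larger k-mer file and filtering the output for these two names makes it easy to see whether the pair is still reported.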
When I add other k-mers to these two and align with the same parameters, the output contains no alignment between these two k-mers. Several of the other k-mers have only 1 mismatch against these two, so I think BWA may be choosing the better hits. Are there any parameters to control this? Thank you.
P.S. Is it possible that BWA looks for the SMEMs and misses the potential hit, and that the re-seeding step misses it as well?
Is there a way to make BWA report this alignment, rather than only the best one? I use -a, -D 0.001 and -T 30 to try to control this. There are only 2 mismatches, and the mismatch penalty (-B1) is set to 1, so the final score should be larger than 30.

Note that BWA is not a generic aligner that can produce all alignments within a given score. You will need to use a different tool for that, for example LASTZ.