bowtie2: -a option returns one valid alignment, but -k 1 returns none
2.6 years ago

I am confused by the behavior of the -k and -a options. I am attempting to align a set of short (23-31 bp) reads which may have up to two adjacent mismatches in them. Because of this, I am decreasing -L significantly (as low as 10) in order to still align those reads with 2 mismatches. I am aware that in many cases, this will cause the read to fail to be aligned because of the -D option. To circumvent this, I've tried to use the -k option to simply accept the first alignment that meets the score threshold (which should be tolerant to two mismatches). However, this is not working as expected.

Using the following read (which is expected to have a CC>TT mismatch at the 7th and 8th positions from the right-hand side):

@SRR3062593.2063075.1 HISEQ:224:C5LRDANXX:5:1311:8220:30771 length=50
GGTATTGCTGGAATTTCCAGCATTTGCCAG
+SRR3062593.2063075.1 HISEQ:224:C5LRDANXX:5:1311:8220:30771 length=50
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFF

And the following alignment parameters:

-L 10 -k 1 --mp 6,6 --score-min L,-12,-0.1

I get the following result:

1 reads; of these:
  1 (100.00%) were unpaired; of these:
    1 (100.00%) aligned 0 times
    0 (0.00%) aligned exactly 1 time
    0 (0.00%) aligned >1 times
0.00% overall alignment rate

However, when I run the same read with the -a flag instead of the -k 1 parameter, it aligns the read to exactly one site. Clearly, there is one (and only one) valid alignment for the read, so why isn't this found using -k 1?
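As a sanity check on the claim that two mismatches should pass the threshold, the arithmetic can be sketched as below. This is only a rough model, assuming end-to-end mode (where matches score 0), a gapless alignment with no N characters, and that the linear `--score-min` function is evaluated on the read length; the helper names are made up for illustration and are not part of bowtie2.

```python
# Read and parameters copied from the post above.
READ = "GGTATTGCTGGAATTTCCAGCATTTGCCAG"
MISMATCH_PENALTY = 6          # --mp 6,6: MX == MN, so base quality does not matter
SCORE_MIN = (-12.0, -0.1)     # --score-min L,-12,-0.1 -> f(x) = -12 + (-0.1) * x


def min_score(read_len, const=SCORE_MIN[0], coeff=SCORE_MIN[1]):
    """Linear minimum-score function, f(x) = const + coeff * x."""
    return const + coeff * read_len


def mismatch_score(n_mismatches, penalty=MISMATCH_PENALTY):
    """Assumed end-to-end score of a gapless alignment with only mismatches."""
    return -penalty * n_mismatches


def is_valid(n_mismatches, read_len):
    # Small tolerance guards against floating-point noise in the threshold.
    return mismatch_score(n_mismatches) >= min_score(read_len) - 1e-9


threshold = min_score(len(READ))
print("read length:", len(READ), "threshold:", threshold)
for n in range(4):
    print(n, "mismatches -> score", mismatch_score(n), "valid:", is_valid(n, len(READ)))
```

Under these assumptions, an alignment with two mismatches scores above the threshold for this read and one with three does not, which is consistent with the single alignment reported by -a.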

Alternatively, I could circumvent this problem entirely if I had a way to allow 2 adjacent mismatches during seeding. If anyone is aware of a tool/method to achieve this, I'm all ears!

I am using the most recent version of bowtie2, v2.4.5

alignment bowtie2 • 658 views

What happens when you pass neither flag (i.e., the default reporting mode)?

Combining multiple parameters may have unintended consequences. When you select -a, the aligner may "try harder" and find an alignment that looks "better" than anything it reaches with -k 1. The order in which an alignment is found and a constraint is applied may not be the one we intuitively expect. I am not saying this is definitely the cause, but I have seen this type of behavior before.
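One way to run that comparison is to generate the three command lines side by side. The sketch below only builds and prints the commands; the index name `ref_index`, the input file `read.fastq`, and the output file names are placeholders, not anything from the original post.

```python
import shlex

# Options shared by every run, copied from the question.
COMMON = ["-L", "10", "--mp", "6,6", "--score-min", "L,-12,-0.1"]

# Reporting modes to compare: default (neither flag), -k 1, and -a.
MODES = {"default": [], "k1": ["-k", "1"], "all": ["-a"]}


def bowtie2_cmd(mode, index="ref_index", reads="read.fastq"):
    """Build a bowtie2 command line for one reporting mode (placeholder paths)."""
    return (["bowtie2"] + COMMON + MODES[mode]
            + ["-x", index, "-U", reads, "-S", f"{mode}.sam"])


for mode in MODES:
    print(shlex.join(bowtie2_cmd(mode)))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run`, and the alignment summaries compared across the three modes.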


When only the seed length is changed (neither -k nor -a), the read does not align, presumably because the default value of the -D parameter is only 15, and with a seed length of 10 the seed extensions can easily fail that many times in a row.
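To see which seeds the two mismatches would actually hit, here is a rough sketch of forward-strand seed windows for this read. It assumes the end-to-end default interval function `S,1,1.15`, rounds the result to the nearest integer (the rounding rule is my assumption), and ignores reverse-complement seeds and re-seeding; the function names are illustrative only.

```python
import math

READ = "GGTATTGCTGGAATTTCCAGCATTTGCCAG"
SEED_LEN = 10  # -L 10

# 0-based indices of the 7th and 8th bases from the right-hand end.
MISMATCH_IDX = {len(READ) - 8, len(READ) - 7}


def seed_interval(read_len, const=1.0, coeff=1.15):
    """Assumed interval function S,1,1.15: const + coeff * sqrt(read_len)."""
    return max(1, round(const + coeff * math.sqrt(read_len)))


def seed_windows(read_len, seed_len, interval):
    """Half-open [start, end) windows taken every `interval` bases."""
    return [(s, s + seed_len) for s in range(0, read_len - seed_len + 1, interval)]


def clean_windows(windows, mismatch_idx):
    """Windows that contain none of the mismatched positions."""
    return [w for w in windows if not any(w[0] <= i < w[1] for i in mismatch_idx)]


interval = seed_interval(len(READ))
windows = seed_windows(len(READ), SEED_LEN, interval)
print("interval:", interval)
print("all windows:", windows)
print("mismatch-free windows:", clean_windows(windows, MISMATCH_IDX))
```

With these assumptions only the last window overlaps the mismatched bases, so the limiting factor would be how many times the short seeds hit elsewhere in the genome rather than a lack of exact seeds.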

If what you describe is actually the case, then the manual is definitely misleading, as it states that when the -k parameter is used, "the search terminates when it can't find more distinct valid alignments, or when it finds <int>, whichever happens first." Additionally, "[The value for -D] is automatically adjusted up when -k or -a are specified".

Of course, I may just be misinterpreting something in the manual, but at the moment it seems pretty cut and dried...
