Assume a read aligner (e.g. BWA) is fed a pattern P = GACT. The text that BWA has preprocessed is NNNAGTCNNN. The aligner would not find the pattern as given, so it would fall back to reverse-complementing it to AGTC and searching again. The question is: what would it report in the SAM file? Would it output POS = 3, CIGAR = 4M, and a FLAG with bit 0x10 set?
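To make the scenario concrete, here is a minimal Python sketch of the exact-match case. It is a toy, not how BWA works internally: the helper names `reverse_complement` and `toy_exact_hit` are made up for illustration, and the search is a plain substring lookup rather than an FM-index query. The only SAM facts it relies on are that POS is 1-based, that FLAG bit 0x10 marks a read stored as its reverse complement, and that `M` in a CIGAR string denotes aligned columns.

```python
# Toy illustration only: hypothetical helpers, not BWA's algorithm.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (N stays N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def toy_exact_hit(pattern: str, text: str):
    """Try the forward strand first, then the reverse complement.

    Returns (flag, pos_1based, cigar) for the first exact hit, or None.
    """
    for flag, query in ((0, pattern), (0x10, reverse_complement(pattern))):
        idx = text.find(query)  # 0-based offset, or -1 if absent
        if idx != -1:
            # SAM POS is 1-based, hence idx + 1.
            return flag, idx + 1, f"{len(query)}M"
    return None


if __name__ == "__main__":
    print(reverse_complement("GACT"))
    print(toy_exact_hit("GACT", "NNNAGTCNNN"))
```

Under these assumptions the hit sits at 0-based offset 3, which corresponds to POS = 4 in SAM's 1-based coordinates, with FLAG 16 (0x10) and CIGAR 4M. Whether a real aligner reports exactly this for such a short read is a separate matter that depends on its parameters.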
EDIT: To make the question complete and unambiguous: what if the indexed text is T = NNNAGTTNNN, so that the reverse complement actually has to be aligned with an error? Would the CIGAR then be 3M1S?
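The mismatch case can be sketched the same way. The snippet below is again a hypothetical toy (the function `best_ungapped_hit` is invented here and simply slides the query along the text, counting mismatches); it does not reproduce any aligner's scoring. It is meant to show the two candidate encodings: in the SAM specification `M` covers both matching and mismatching columns, while `S` means soft clipping, so one mismatch at the end could be written either as 4M (typically with an NM:i:1 tag) or as 3M1S if the last base is clipped.

```python
# Toy illustration only: hypothetical helpers, not BWA's algorithm.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (N stays N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def best_ungapped_hit(query: str, text: str):
    """Return (mismatches, pos_1based) of the leftmost ungapped placement
    with the fewest mismatches."""
    best = None
    for i in range(len(text) - len(query) + 1):
        window = text[i:i + len(query)]
        mismatches = sum(a != b for a, b in zip(query, window))
        if best is None or mismatches < best[0]:
            best = (mismatches, i + 1)  # SAM POS is 1-based
    return best


if __name__ == "__main__":
    query = reverse_complement("GACT")  # AGTC
    mismatches, pos = best_ungapped_hit(query, "NNNAGTTNNN")
    # Two ways the same placement could be encoded:
    end_to_end = (pos, f"{len(query)}M", f"NM:i:{mismatches}")
    soft_clipped = (pos, f"{len(query) - 1}M1S")
    print(end_to_end)
    print(soft_clipped)
```

Which of the two a real aligner emits depends on its alignment mode and clipping penalties, so the sketch only enumerates the options rather than predicting BWA's output.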
Although this is a Q&A site rather than a forum, I've still updated my question to be more comprehensive and complete. Thank you.
I tried. See my edit.
Ah, something short-circuited in my mind and I was assuming S stands for "substitution". Now fixed.