How is the 2nd best hit found in bwa-mem
0
0
Entering edit mode
6.9 years ago
IP ▴ 770

Hi biostars!

I have a question about how is the 2nd best hit found in bwa-mem for calculating the mapping quality. From the bwa paper, it is stated that the mapping quality scores are calculated using a similar algorithm as in the MAQ aligner. In the MAQ paper one can found the following formula for the calculation of mapping quality scores:

mapq formula

Copying and pasting from the paper:

"q1 is the sum of quality values of mismatches of the best hit, q2 is the corresponding sum for the second best hit, n2 is the number of hits having the same number of mismatches as the second best hit, k is the minimum number of mismatches in the 28-bp seed, q is the average base quality in the 28-bp seed, 4.343 is 10/log10, and p1(k,28) is the probability that a perfect hit and a k-mismatch hit coexists given a 28-bp sequence that can be estimated during alignment."

My question comes situations as in the following example:

Lets imagine that we have a illumina sequenced read (100bp), that has a perfect alignment to the position chr1:100000-100100. Then, the second best alignment will be in a lot of cases those alignment positions +1 (chr1:100001-100101.). Which in the top of my head does not makes sense to consider for the 2nd hit

Other implementations of the Farrar Smith-Waterman algorithm find the 2nd best hit by masking the 1st hit. Does anybody know how this is handled in bwa-mem?

bwa-mem • 1.8k views
ADD COMMENT
0
Entering edit mode

Did you just ... tag Heng Li? Could have just mentioned him, you know? lh3

ADD REPLY
0
Entering edit mode

ops, that was my idea, thanks :)

ADD REPLY

Login before adding your answer.

Traffic: 1700 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6