I'm curious about how mapping quality is calculated in sequence mappers. So far I haven't found a really explicit explanation of this.
For Bowtie2 I found this in the manual:
Aligners characterize their degree of confidence in the point of origin by reporting a mapping quality: a non-negative integer Q = -10 log10 p, where p is an estimate of the probability that the alignment does not correspond to the read's true point of origin. Mapping quality is sometimes abbreviated MAPQ, and is recorded in the SAM MAPQ field
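To make the quoted formula concrete, here is a minimal sketch of the Q = -10 log10 p conversion in both directions. The function names and the cap of 60 are my own illustrative assumptions, not taken from Bowtie2 or any other mapper; each aligner picks its own maximum value. It only shows the Phred scaling and says nothing about how p itself is estimated.

```python
import math

def mapq_from_error_prob(p, cap=60):
    """Phred-scale an error probability p into an integer quality.

    `cap` is an arbitrary illustrative maximum (an assumption);
    real aligners choose their own upper bound.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if p == 0.0:
        # log10(0) is undefined, so fall back to the cap.
        return cap
    return min(cap, round(-10 * math.log10(p)))

def error_prob_from_mapq(q):
    """Invert the Phred scaling: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)

if __name__ == "__main__":
    for p in (0.5, 0.1, 0.01, 0.001):
        print(p, mapq_from_error_prob(p))
```

So every 10 points of MAPQ corresponds to a tenfold drop in the estimated probability that the alignment is wrong.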
For Maq, their description says:
The calculation of mapping qualities is simple, but this simple calculation considers all the factors below:
The repeat structure of the reference. Reads falling in repetitive regions usually get very low mapping quality.
The base quality of the read. Low quality means the observed read sequence is possibly wrong, and wrong sequence may lead to a wrong alignment.
The sensitivity of the alignment algorithm. The true hit is more likely to be missed by an algorithm with low sensitivity, which also causes mapping errors.
Paired end or not. Reads mapped in pairs are more likely to be correct.
Though these give some conceptual explanation, neither says how 'p' is actually calculated.
There is an SEQ page that contains some useful info, but it still just says:
Assuming your mapper puts something useful in here
Any idea about this?
See this blog post by Dr. Simon Andrews on this topic.
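As a rough illustration of how two of the factors in the Maq description (repeat structure and base quality) could feed into p, here is a toy sketch. It treats p as one minus the posterior probability of the best candidate hit, with each hit's likelihood taken from the base qualities at its mismatched positions. This is an assumption-laden simplification, not the actual code of Maq, Bowtie2, or any other mapper; the function names, the input layout, and the cap of 60 are all hypothetical.

```python
import math

def hit_likelihood(mismatch_quals):
    """Likelihood of the read given one candidate location.

    Simplification (assumption): matching bases contribute ~1, and each
    mismatch contributes its base-error probability 10^(-Q/10).
    """
    return 10 ** (-sum(mismatch_quals) / 10)

def toy_mapping_quality(candidate_hits, cap=60):
    """candidate_hits: one list of mismatch base qualities per candidate
    location, e.g. [[], [30]] = a perfect hit and a hit with one Q30 mismatch.
    """
    if not candidate_hits:
        raise ValueError("need at least one candidate hit")
    likelihoods = [hit_likelihood(h) for h in candidate_hits]
    # Posterior of the best hit, assuming a uniform prior over candidates.
    p_wrong = 1.0 - max(likelihoods) / sum(likelihoods)
    if p_wrong <= 0.0:
        return cap  # only one plausible location
    return min(cap, round(-10 * math.log10(p_wrong)))

if __name__ == "__main__":
    print(toy_mapping_quality([[]]))        # unique perfect hit
    print(toy_mapping_quality([[], []]))    # two equally good hits (repeat)
    print(toy_mapping_quality([[], [30]]))  # runner-up has a Q30 mismatch
```

In this toy model a read with two equally good hits gets p = 0.5 and so a quality of about 3, which matches the observation that reads in repeats get very low mapping quality. Real mappers also account for hits the search may have missed and for pairing, which this sketch ignores.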
Thank you for the information. It helps.