Does anyone know how ELAND computes mapping quality or, more importantly, whether it is a proper error probability rather than some sort of alignment score? I have to deal with ELAND-aligned samples and couldn't find any documentation on this.
Well, I will tell you what I think: nobody knows what mapping quality should be, and most aligners use values that are very subjective and usually not documented at all. This is true for bwa, bowtie, etc. You can tell because when you summarize the mapping qualities in a SAM file there will be maybe 10 different values at most, instead of values covering the entire spectrum of probabilities.
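For what it's worth, here is a minimal sketch of that kind of summary in plain Python. It assumes uncompressed, tab-separated SAM text with MAPQ in the fifth column and the unmapped flag at bit 0x4; the function name `mapq_histogram` and the toy records are made up for illustration, not taken from any aligner.

```python
from collections import Counter


def mapq_histogram(sam_lines):
    """Count how often each MAPQ value occurs among mapped reads."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped, so its MAPQ carries no information
        counts[int(fields[4])] += 1  # MAPQ is the 5th SAM column
    return counts


# Toy records, invented for this example only.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t37\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t20\t37\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tchr1\t30\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

hist = mapq_histogram(example)
for mapq, n in sorted(hist.items()):
    print(mapq, n)
```

On a real file you would pass an open file handle instead of the list; if only a handful of distinct values show up, that supports the point above.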
So the situation is that the field has a well-defined numerical meaning: it encodes, on a Phred scale, the probability that the read is mapped to the wrong place. The value that mappers actually fill in, however, is purely ordinal (a ranking): it only says that the mapper trusts one alignment more than another. And of course there is no correct value that you could check against.
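To make the intended numerical meaning concrete, here is a small sketch of the Phred-style conversion that the SAM format defines for this field, MAPQ = -10 * log10(P(mapping position is wrong)). The helper names are hypothetical, and whether a given aligner's values actually behave like this is exactly what is in doubt.

```python
import math


def mapq_to_error_prob(mapq):
    """Probability that the mapping is wrong, if MAPQ were a true Phred score."""
    return 10 ** (-mapq / 10)


def error_prob_to_mapq(p_wrong):
    """Phred-scale an error probability, rounded to the nearest integer."""
    if not 0 < p_wrong <= 1:
        raise ValueError("p_wrong must be in (0, 1]")
    return round(-10 * math.log10(p_wrong))


# If the scores were calibrated, MAPQ 20 would mean about 1 wrong mapping in 100.
for q in (0, 10, 20, 30):
    print(q, mapq_to_error_prob(q))
```

Under this reading, a MAPQ of 0 corresponds to an error probability of 1, which is why purely rule-based values are hard to interpret as probabilities.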
Isn't the bwa algorithm based on the statistical framework from the supplementary section of the Maq paper (I thought I saw Heng Li write that somewhere, but I guess I've never gone through the code)? If so, that's at least a pretty reasonable approach. For bowtie, yeah, it just outputs a vague approximation.
All from memory: I once looked through the code, and if I remember correctly there were instances where mapq was computed with an IF/THEN construct of the sort: if there are two mismatches, then set it to 17 (it is in an answer here on Biostar, but I could not immediately find it). Of course I may be wrong about this, in which case I need to validate my opinion. I'll try to look up the source.
Hi Istvan, I'm aware that the mapping quality is usually not interpreted as a probability (even though it should be; what's the point of using a Phred-scaled score otherwise?). Exceptions to the rule seem to be BWA-PSSM and Stampy, as far as I can tell. Nevertheless, since I couldn't find any documentation for ELAND, I was hoping that ELAND does something sensible and thought I'd ask. Cheers, Andreas
updated 5.2 years ago by Ram • written 11.1 years ago by Andreas