Question

Phred quality Score

11.2 years ago

stat.1405 ▴ 30

If the quality score is Q = -10 log(P),

where P is the probability that the variant (or the call) is wrong, then P = 0.01 gives Q = 20, P = 0.001 gives Q = 30, and so on.

My questions are:

1. How is P calculated?

2. How is P = 10^(-Q/10) derived from the formula above?

I read this paper to understand the calculation of P, but it's complicated.

alignment next-gen SNP • 9.3k views

updated 3.9 years ago by Ram 45k • written 11.2 years ago by stat.1405 ▴ 30

Ram · Answer 1 · 2014-05-21

11.2 years ago

Devon Ryan 105k

That depends entirely on the tool. For open-source tools, see the source code or the papers. For base calls you might have to ask Illumina (unless they've published the algorithm).
Note that log() is log10() here (I realize the notation is ambiguous), so deriving P from Q is just simple algebra.
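
For reference, the algebra can be written out as follows, assuming the base-10 logarithm mentioned above. This is only a worked rearrangement of the definition, not any particular tool's method for estimating P.

```latex
\begin{align*}
Q &= -10 \log_{10}(P) \\
-\frac{Q}{10} &= \log_{10}(P) \\
P &= 10^{-Q/10}
\end{align*}
```

With Q = 20 this gives P = 10^(-2) = 0.01, matching the example in the question.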

updated 5.6 years ago by Ram 45k • written 11.2 years ago by Devon Ryan 105k

For (2), I arrived at -Q/10 = log P, but then I cannot get it into the form it

should be: e^Q/10 = P

updated 5.6 years ago by Ram 45k • written 11.2 years ago by stat.1405 ▴ 30

Yeah, it's unfortunate that log() is such a common function but also has an ambiguous base. Sometimes it's log10() (like here), other times it's ln(), still other times it's log2() (I'm sure some group uses yet another base).
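
To illustrate why the base matters, here is a small Python sketch (the variable names are only for illustration) that applies -10*log(P) to P = 0.01 with three different bases. Only base 10 reproduces the expected Q = 20.

```python
import math

p = 0.01  # error probability from the question's example

# Same formula, three different interpretations of log()
q_base10 = -10 * math.log10(p)  # base 10, the Phred convention
q_base_e = -10 * math.log(p)    # natural log, not the Phred convention
q_base2 = -10 * math.log2(p)    # base 2, not the Phred convention

print(round(q_base10, 2))  # 20.0
print(round(q_base_e, 2))  # 46.05
print(round(q_base2, 2))   # 66.44
```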

updated 5.6 years ago by Ram 45k • written 11.2 years ago by Devon Ryan 105k

Also, you forgot a negative sign

-10*log10(P)
^

updated 5.6 years ago by Ram 45k • written 11.2 years ago by Devon Ryan 105k

Ram · Answer 2 · 2014-05-21

As simple as this problem is, there are plenty of gotchas. First, the Phred score was originally defined to quantify error, not correctness. In other words, a high Phred score means there is a low chance of error. Some widely used programs (not to be named) Phred-scale the probability of something being correct instead. When un-Phred-ing, ask yourself: does the probability represent correctness or incorrectness?

The probability of the base or genotype being incorrectly called can be computed by:

10^(-phredScore/10)

Drop a couple of these lines into R to check the equation:

10^(-10/10)  # 0.1 probability of error
10^(-20/10)  # 0.01 probability of error
10^(-50/10)  # 1e-5 probability of error
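
The same conversion, in both directions, can be sketched in Python. The function names `phred_to_error_prob` and `error_prob_to_phred` are made up for this illustration and do not come from any particular tool. The sketch assumes the score encodes the probability of error, as in the original Phred definition.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred score to the probability that the call is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred score."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


for q in (10, 20, 50):
    p_error = phred_to_error_prob(q)
    # 1 - p_error is the probability that the call is correct
    print(q, p_error, 1 - p_error)
```

Round-tripping, e.g. `error_prob_to_phred(phred_to_error_prob(30))`, should return a value very close to 30.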

To sum up: watch out for Phred-scaled likelihoods. They are not Phred scores.