Blast E Value Calculation
5
2
Entering edit mode
11.8 years ago
darxsys ▴ 240

I'm trying to work out the formula for BLAST e value calculation. I have surfed all across the internet looking for it, but the only thing I'm able to find is the formula provided here: http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html The problem is, I can't get the correct value for whatever parameters I plug into it. What I do? I run blastp module of blast on a query and Uniprot database. BLAST then gives me values for gapped K and lambda, as well as scores, database length and query length. When I plug that into the formula E = K * m * n * exp (- lambda * S), I get absolute nonsense. I tried various stuff for m and n - including database length, query length, result sequence length - tried bit scores and real scores, and I simply can't get the E value which blast outputs by itself, no matter what. Is there a way to correctly calculate E value and how? Please help.

blast • 25k views
ADD COMMENT
1
Entering edit mode

Is there some reason you need to calculate e-value for yourself? Seems like blastp has done it for you :)

ADD REPLY
0
Entering edit mode

Yes. I'm trying to find a way to calculate it because I need it in my project (if it is possible in any way).

ADD REPLY
1
Entering edit mode

Hi, did you get the answer for this questions. Thanks. If so,could you give some tips

ADD REPLY
0
1
Entering edit mode
11.8 years ago

bitscore is a normalized score calculated from the alignment which depends on the scoring system: (lambda * S - ln(k)) / ln(2)

The p-value of a blast alignment is basically: 1 / ( 2 ^ bitscore)

So if your bitscore is 10, you would need to score 2 ^ 10 alignments before you will get a score as good or better.

The e-value is just a p-value normalized to the database size: query length * database length * p-value = query length * database length / (2 ^ bitscore)

ADD COMMENT
0
Entering edit mode

Thanks for help, but still, I don't get the same result as blast. Okay, I'll give you the example on which I'm trying to do it in the hope you can get something out of it. I run blastp on Uniprot, with this query: Query= d1r5la1 a.5.3.1 (A:25-90) Alpha-tocopherol transfer protein {Human (Homo sapiens) [TaxId: 9606]} Length=66 I took that query from SCOP/Astral database. The first significant alignment I get from blast is for > sp|P49638|TTPA_HUMAN. and blast says this: Score = 136 bits (343) and Expect = 1e-39. Okay. When I plug in the stuff you told me to, I get: 1.448e-31. That is nowhere close the blast's value. Assuming database length is 191,240,774, K is 0.041, and lambda is 0.267. (Although I didn't need them right now.)

ADD REPLY
0
Entering edit mode

How did you get lambda? Some versions of blast don't use a constant lambda for all HSPs.

ADD REPLY
0
Entering edit mode

Gapped and ungapped values for lambda are given as part of the outpuf of blastp.

ADD REPLY
0
Entering edit mode

I think that recent versions of blastp use a slightly different stats model than the classical Altschul one. I faced similar discrenpancies between e-values from blastp and those I computed myself the same way you did.

What is your blast version ? If it is quite recent, you should try version 2.2.26 or 2.2.27 for instance (I guess they still use the old model). Otherwise, you could set the environment variable OLD_FSC to 1 before launching your blast command; it will switch back to the old model.

ADD REPLY
0
Entering edit mode

Thanks. I used the latest version. I try OLD_FSC = 1 and see what happen

ADD REPLY
0
Entering edit mode

How do you set OLD_FSC=1?

ADD REPLY
0
Entering edit mode

It's Protein-Protein BLAST 2.2.29+

ADD REPLY
0
Entering edit mode
8.3 years ago

E=pN where p is the p-value and N is the total length of the database divided by the length of the aligned database sequence

ADD COMMENT
0
Entering edit mode
8.2 years ago
Lhl ▴ 760

have a read of this http://homepages.ulb.ac.be/~dgonze/TEACHING/stat_scores.pdf

ADD COMMENT
0
Entering edit mode
8.0 years ago
midox ▴ 290

excuse me. here how can i determine the K and lambda values of the e-value formula please?

thanks

ADD COMMENT

Login before adding your answer.

Traffic: 1975 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6