So, I ran a blastp query online and downloaded the output file.
For BLOSUM62 with gap-open a = 11 and gap-extend b = 1, the tabulated Gumbel parameters are alpha = 1.9 and beta = -30. I then computed the length adjustment by solving the fixed-point equation (the routine in blast_stat.c), which gave l = 129.
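In case it helps, this is roughly how I iterated the fixed point l = alpha/lambda * ln(K*(m-l)*(n-N*l)) + beta. It's a minimal sketch; the real routine in blast_stat.c brackets the root and has convergence safeguards that I'm omitting here:

```python
import math

def length_adjustment(K, lam, alpha, beta, m, n, N, max_iter=20):
    """Iterate l = alpha/lambda * ln(K*(m-l)*(n-N*l)) + beta to a fixed point.
    Simplified sketch of the blast_stat.c computation."""
    ell = 0.0
    for _ in range(max_iter):
        space = K * (m - ell) * (n - N * ell)
        if space <= 0:          # search space exhausted; stop iterating
            break
        ell_new = alpha / lam * math.log(space) + beta
        if abs(ell_new - ell) < 1e-9:
            ell = ell_new
            break
        ell = ell_new
    return int(ell)             # BLAST keeps the integer part

print(length_adjustment(K=0.041, lam=0.267, alpha=1.9, beta=-30,
                        m=1286, n=175697176, N=466914))   # prints 129
```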
I picked one HSP with score S = 79 (<Hsp_score>79</Hsp_score>).
I picked lambda, K, N, n from following section:
<Statistics>
<Statistics_db-num>466914</Statistics_db-num> : N
<Statistics_db-len>175697176</Statistics_db-len> : n
<Statistics_hsp-len>0</Statistics_hsp-len>
<Statistics_eff-space>0</Statistics_eff-space>
<Statistics_kappa>0.041</Statistics_kappa> : K
<Statistics_lambda>0.267</Statistics_lambda> : lambda
<Statistics_entropy>0.14</Statistics_entropy>
</Statistics>
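For reference, the fields can be pulled out of the XML with just the standard library; this snippet hard-codes the Statistics fragment above rather than reading the whole output file:

```python
import xml.etree.ElementTree as ET

xml_snippet = """<Statistics>
  <Statistics_db-num>466914</Statistics_db-num>
  <Statistics_db-len>175697176</Statistics_db-len>
  <Statistics_hsp-len>0</Statistics_hsp-len>
  <Statistics_eff-space>0</Statistics_eff-space>
  <Statistics_kappa>0.041</Statistics_kappa>
  <Statistics_lambda>0.267</Statistics_lambda>
  <Statistics_entropy>0.14</Statistics_entropy>
</Statistics>"""

stats = ET.fromstring(xml_snippet)
N = int(stats.findtext("Statistics_db-num"))     # number of database sequences
n = int(stats.findtext("Statistics_db-len"))     # total database length
K = float(stats.findtext("Statistics_kappa"))
lam = float(stats.findtext("Statistics_lambda"))
print(N, n, K, lam)
```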
m = query length = 1286.
Then I computed the E-value with this formula: evalue = K * (m - l) * (n - N*l) * e^(-lambda * S), and I got 3.7843. But the output file reports 3.35488. I tried other HSPs as well; I never match the value in the output file.
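Here is the exact computation I did, with the numbers above plugged in, so you can check my arithmetic (it comes out to about 3.784, matching my hand calculation but not the file):

```python
import math

K, lam = 0.041, 0.267
m, n, N = 1286, 175697176, 466914
l = 129          # length adjustment from the fixed-point equation
S = 79           # raw HSP score

m_eff = m - l                # effective query length
n_eff = n - N * l            # effective database search space
evalue = K * m_eff * n_eff * math.exp(-lam * S)
print(evalue)    # ~3.784, not the 3.35488 in the output file
```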
Am I doing something wrong, or has the formula changed subtly? Am I reading m wrong: should m depend on the HSP length rather than being the query length?
The scoring method changed in version 2.2.26. Instead of the old length adjustment, an "improved finite size correction" is used: "An improved finite size correction is now used for blastp/blastx/tblastn/rpsblast." (https://www.ncbi.nlm.nih.gov/books/NBK131777/#Blast_ReleaseNotes.BLAST_2_2_26_January).
The method is described in the paper "New finite-size correction for local alignment score distributions". I have yet to find where in the code this actually happens.