How Is The Word Score For The Look-Up Table Determined For Blast?
12.1 years ago
Niek De Klein ★ 2.6k

I'm going over some slides about BLAST from an introductory bioinformatics course. I don't understand how the words for the look-up table are scored. The query sequence is split into overlapping words of 3 letters. In the example the query sequence is QLNFSAGW, so the words are QLN, LNF, NFS, etc. The scores given for the words when found are (using a BLOSUM table):

Words from sequence   Query words
----------------------------------------
QLN                   QLN=11, QMD=9, HLN=8 etc.
LNF                   LNF=9, LBF=8, LBT=8 etc.
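
For reference, the word-splitting step can be sketched in a few lines of Python. This is only an illustration of a sliding window over the query; the function name `split_into_words` and the default word length `k=3` are my own choices for the example, not something taken from the slides or from BLAST itself.

```python
def split_into_words(sequence, k=3):
    """Return every overlapping word of length k, in order of position."""
    # A sequence of length n yields n - k + 1 words (none if n < k).
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# The query sequence from the slides.
for word in split_into_words("QLNFSAGW"):
    print(word)
```

For the 8-letter query this gives six words, starting QLN, LNF, NFS.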

However, when I look at the BLOSUM62 table Q-Q = 5, L-L = 4 and N-N = 6. So why is QLN 11 points and not 15 points? Same for QMD: Q-Q = 5, L-M = 2, N-D = 1, why is QMD 9 and not 8? How are the word scores calculated?
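
To make the arithmetic I'm doing explicit, here is a minimal Python sketch that scores a word by summing the per-position substitution scores. It assumes the word score is simply that sum, which is exactly the assumption I'm asking about. The dictionary holds only the BLOSUM62 entries quoted above; `BLOSUM62_SUBSET`, `pair_score` and `word_score` are names made up for this example.

```python
# Only the BLOSUM62 entries quoted in the question; not the full matrix.
BLOSUM62_SUBSET = {
    ("Q", "Q"): 5,
    ("L", "L"): 4,
    ("N", "N"): 6,
    ("L", "M"): 2,
    ("N", "D"): 1,
}


def pair_score(a, b):
    """Look up a substitution score; the matrix is symmetric."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    # Raises KeyError for pairs that are not in the subset above.
    return BLOSUM62_SUBSET[(b, a)]


def word_score(query_word, candidate):
    """Sum the per-position scores of two equal-length words."""
    if len(query_word) != len(candidate):
        raise ValueError("words must have the same length")
    return sum(pair_score(a, b) for a, b in zip(query_word, candidate))


print(word_score("QLN", "QLN"))  # 5 + 4 + 6 = 15
print(word_score("QLN", "QMD"))  # 5 + 2 + 1 = 8
```

Under that assumption I get 15 and 8, not the 11 and 9 shown on the slide.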

The NCBI handbook only mentions that it makes a look-up table of all the words, not how the words are scored.

BLAST works by first making a look-up table of all the “words” (short subsequences, which for proteins the default is three letters) and “neighboring words”, i.e., similar words in the query sequence.


Are you sure you are using the same BLOSUM table? The number that appears after BLOSUM (BLOSUM62, BLOSUM80, BLOSUM45, etc.) is the percent identity threshold used to cluster the sequences the matrix was generated from, so a higher number means less divergent sequences. Maybe you and the slides were using different BLOSUM tables.


Oh, I misread your question. Yes, I'm sure I'm using the same one. At the bottom of the slide it says "Scoring is done using the BLOSUM62 amino acid exchange matrix [33]", and I used BLOSUM62 to calculate the scores. I'm going to calculate it using the other BLOSUM matrices now (I tried PAM250, but it doesn't give the same numbers) to see if it matches a different one.

BLOSUM80 and BLOSUM45 also give different numbers.
