Thanks for all the answers. I've started working through the many sources suggested here, and they have given me a good introduction to BLAST.
Now that I understand a little more, I can ask more specific questions, such as the following.
Take this BLAST result:
ATTTGCAGAATTTGCAAAAAAATGTTTGT
|||||||||||||||    ||||||||||
ATTTGCAGAATTTGC----AAATGTTTGT
I'd like to calculate the raw alignment score (scoring scheme: match = 1, mismatch = -2, gap opening = 3, gap extension = 2).
We have 25 matches and one gap of 4 positions. Is the score S = 25 - 9 = 16 (gap cost 3 + 2 + 2 + 2 = 9), or do I also have to count the 4 gap positions as 4 mismatches, which would give S = 25 - 9 - 8 = 8?
Which one is correct?
Thanks! Beeth.
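To make the arithmetic in the question concrete, here is a minimal Python sketch that scores the alignment column by column. It assumes the usual affine-gap scheme, in which a gapped column is charged only the gap penalty and is not additionally counted as a mismatch. The function name `raw_score` and the flag `open_includes_first` are made up for this illustration. The flag switches between two common conventions: the opening penalty either covers the first gapped position (the reading used in the question, gap cost 3 + 2 + 2 + 2 = 9) or is charged on top of an extension penalty for every gapped position (gap cost 3 + 4 * 2 = 11). Which convention a given tool uses should be checked in its documentation.

```python
def raw_score(query, subject, match=1, mismatch=-2,
              gap_open=3, gap_extend=2, open_includes_first=True):
    """Raw score of a pairwise alignment given as two gapped strings.

    open_includes_first=True : gap of length k costs gap_open + (k - 1) * gap_extend
    open_includes_first=False: gap of length k costs gap_open + k * gap_extend
    """
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")

    score = 0
    gap_side = None  # None, "query" or "subject": which row the current gap is in
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            side = "query" if q == "-" else "subject"
            if side != gap_side:
                # A new gap starts here: charge the opening penalty.
                score -= gap_open
                if not open_includes_first:
                    score -= gap_extend
            else:
                # The current gap is extended by one position.
                score -= gap_extend
            gap_side = side
        else:
            score += match if q == s else mismatch
            gap_side = None
    return score


# The alignment from the question, split to make the gapped region visible.
query = "ATTTGCAGAATTTGC" + "AAAA" + "AAATGTTTGT"
subject = "ATTTGCAGAATTTGC" + "----" + "AAATGTTTGT"

print(raw_score(query, subject))                             # 25 - 9
print(raw_score(query, subject, open_includes_first=False))  # 25 - 11
```

Under either convention the 4 gapped columns are not scored again as mismatches, so the value 8 does not come out of this scheme.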
Note that BLAST is a bit more complicated than this: it scales match/mismatch scores by the nucleotide's frequency in the data set (i.e. if A's are more frequent in general than G's, then matching a T to an A is going to be cheaper than matching it to a G). Doesn't Karlin-Altschul also make some adjustment for sequence lengths?
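On the sequence-length point: in Karlin-Altschul statistics the raw score S is converted to a bit score S' = (lambda * S - ln K) / ln 2, and the expected number of chance hits is E = m * n * 2^(-S'), where m and n are the query and database lengths. The rough Python sketch below only illustrates these two formulas. The values passed for `lam` and `k` are placeholders chosen for the example, not real BLAST parameters (those depend on the scoring scheme and are reported in the BLAST output), and as far as I understand BLAST works with corrected "effective" lengths rather than the raw m and n used here.

```python
import math


def bit_score(raw, lam, k):
    """Convert a raw score to a bit score: (lam * raw - ln k) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def e_value(bits, m, n):
    """Expected number of chance hits with at least this bit score."""
    return m * n * 2.0 ** (-bits)


# Placeholder parameters for illustration only (NOT taken from BLAST).
lam = math.log(2)  # with lam = ln 2 and k = 1 the bit score equals the raw score
k = 1.0

bits = bit_score(16, lam, k)
print(bits)
print(e_value(bits, m=29, n=1_000_000))
```

The lengths enter only through the search-space term m * n, so the same raw score becomes less significant as the database grows.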