I am trying to understand the global alignment score with the following example. Is my solution right or wrong? Suppose sequence A is 200 bp and sequence B is 150 bp, they match exactly over a 100 bp stretch, and the scoring system is: match 2, mismatch 1, gap -1. a) What would the global alignment score be?
My solution:
S=∑(identities, mismatches) - ∑ (gap penalties)
S= ∑ (100+150)- ∑ (-50)
S= 100
Score = Max (S)
b) If you did a local alignment, would you expect the score to be higher or lower than the one above? Answer: The score will be higher with local alignment.
S := ∑(identities, mismatches) + ∑(gap penalties), so S = 100 + 150 + (-50) = 200 if penalties are given as negative values. The same total follows from 2 × 100 (matches) + 1 × 50 (mismatches) + (-1) × 50 (gaps). Strange that a mismatch receives a positive score.
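To make the arithmetic explicit, here is a minimal sketch in Python. It assumes a particular layout that the question does not spell out: all 150 bases of B are paired with bases of A (100 as matches, 50 as mismatches) and the remaining 50 bases of A face gaps. The variable names are just for illustration.

```python
# Scoring scheme from the question
MATCH, MISMATCH, GAP = 2, 1, -1

len_a, len_b = 200, 150
n_match = 100                      # the exactly matching stretch

# Assumed layout: the rest of B is paired as mismatches,
# and the bases of A left over are aligned against gaps.
n_mismatch = len_b - n_match       # 50
n_gap = len_a - len_b              # 50

score = n_match * MATCH + n_mismatch * MISMATCH + n_gap * GAP
print(score)  # 200
```

If the non-matching bases could not be paired as mismatches and had to be gapped instead, the total would be lower, so the layout assumption matters.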
That seems correct, but why?
That's false. Actually, the local alignment score will be at least as high as the global alignment score.
Furthermore, with this particular scoring scheme, it's not possible for an optimal local alignment to score higher than an optimal global alignment, because trimming any base will strictly decrease the alignment score (since mismatches are positive). Therefore, the optimal local and global alignment scores will always be identical.
I think that local and global alignment scores may differ. If they do, the local alignment score will always be higher than the global one. For example, in this simple case:
Global alignment score: 6
Local alignment score: 9
How can you have a trailing deletion with no anchoring base? That doesn't make sense. In that case, the global alignment is not optimal; the optimal global alignment is the same as the local alignment.
Although, it's possible our definitions of global alignment differ. I'm considering the case of aligning a query sequence to a reference sequence (in which case you do not require all reference symbols to be consumed), rather than aligning two sequences that you expect to be the same length (like genes from different species) and requiring all symbols in both the reference and query to be consumed.
I think it is quite likely that you have a different definition of global alignment. AFAIK, global means that both sequences are aligned in their entirety, that is, from start to end. If one sequence is shorter than the other, gaps have to be introduced. There is no 'soft-clipping', nor is there a distinction between a query and a reference sequence in global alignment. I think you are referring to semi-global or 'glocal' alignment.
Hmm, interesting. Yes, I've always referred to global alignments in the context of aligning a query to a reference, in which all query symbols must be consumed, but not all reference symbols (in other words, the query is unclipped). Indeed, looking at Wikipedia's definition, it appears that what I thought of as global is generally called "glocal".
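For anyone who wants to compare the three definitions discussed in this thread, here is a rough sketch of a score-only dynamic program with a linear gap penalty. The function name `align_score` and the `mode` strings are made up for this illustration, and "semiglobal" is implemented as described above (all query symbols consumed, leading and trailing reference symbols free), which is only one of several conventions in use.

```python
def align_score(query, ref, match=2, mismatch=1, gap=-1, mode="global"):
    """Best alignment score under a linear gap penalty.

    mode="global":     both sequences aligned end to end (Needleman-Wunsch).
    mode="semiglobal": whole query aligned; unaligned reference ends are free.
    mode="local":      best-scoring pair of substrings (Smith-Waterman).
    """
    if mode not in ("global", "semiglobal", "local"):
        raise ValueError("unknown mode: " + mode)
    n, m = len(query), len(ref)
    # Row 0: cost of skipping reference bases before the query starts.
    prev = [j * gap for j in range(m + 1)] if mode == "global" else [0] * (m + 1)
    best_local = 0
    for i in range(1, n + 1):
        cur = [0] * (m + 1)
        # Column 0: query bases with no reference base are gapped,
        # except in local mode where they can simply be clipped.
        cur[0] = 0 if mode == "local" else i * gap
        for j in range(1, m + 1):
            pair = match if query[i - 1] == ref[j - 1] else mismatch
            v = max(prev[j - 1] + pair,   # align the two bases
                    prev[j] + gap,        # gap in the reference
                    cur[j - 1] + gap)     # gap in the query
            if mode == "local":
                v = max(v, 0)             # allowed to restart anywhere
                best_local = max(best_local, v)
            cur[j] = v
        prev = cur
    if mode == "global":
        return prev[m]
    if mode == "semiglobal":
        return max(prev)                  # trailing reference bases are free
    return best_local


q, r = "ACGT", "TTACGTTT"
print(align_score(q, r, mode="global"))      # 4
print(align_score(q, r, mode="semiglobal"))  # 8
print(align_score(q, r, mode="local"))       # 8
```

With the scoring scheme from the question, the global score for this toy pair is lower because the four unpaired reference bases must each be charged a gap, while the semi-global and local scores coincide.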