Hi,
I'm running Biopython 1.66 on Linux. In my current example I'm trying to align two strings using the pairwise2 module, with a scoring matrix for matches and mismatches and simple open/extend penalties for gaps (open -1, extend -0.5).
Here's the code:
from Bio import pairwise2

# seq1 and seq2 are the two (ungapped) sequences shown in the alignment below.
# Match = 2, mismatch = -1; 'X' only pairs well with 'X'.
scoreMatrix = {('A', 'A'): 2, ('A', 'C'): -1, ('A', 'G'): -1, ('A', 'T'): -1,
               ('C', 'C'): 2, ('C', 'A'): -1, ('C', 'G'): -1, ('C', 'T'): -1,
               ('G', 'G'): 2, ('G', 'C'): -1, ('G', 'A'): -1, ('G', 'T'): -1,
               ('T', 'T'): 2, ('T', 'C'): -1, ('T', 'G'): -1, ('T', 'A'): -1,
               ('X', 'X'): 2,
               ('X', 'T'): -11,
               ('X', 'C'): -11,
               ('X', 'G'): -11,
               ('X', 'A'): -11,
               ('T', 'X'): -11,
               ('C', 'X'): -11,
               ('G', 'X'): -11,
               ('A', 'X'): -11
               }
# Gap open penalty -1, gap extension penalty -0.5
monomerAllignmentsGlobal = pairwise2.align.globalds(seq1, seq2, scoreMatrix, -1, -0.5)
The resulting alignment does not seem to be optimal:
CTCGGAGTCCCAGAGCCAAGGAGGCTCCCGCTGCCGGGCCCTGAGGCAGAAACCTCTCGGGCCGGGCGGACCCCTGTGCTCTCACCAGGAAG----------
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CTCGGAGTCCCAGAGCCAAGGAGGCTCCCGCCGCCGGGCCCTGAGGCAGAAACCTCTCGGGCCGGGCGGACCCCTGTGCTCTC--------AXXXXXXXXXX
(I apologize, I don't know how to format the result better; copying it into a text editor with a fixed-width font might give you a better overview.)
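The last 'A' in front of 'XXXXXXXXXX' should ideally come before the last gap opening, so that it lines up with the 'A' that directly follows 'CTCTC' in the first sequence instead of being paired with the final 'G'.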
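To make the comparison concrete, here is a rough sketch that scores just the tail of the two alignments by hand. It does not use Biopython: `alignment_score` is a hypothetical helper of my own, and it assumes that a gap run of length n costs open + (n - 1) * extend, which is my understanding of the pairwise2 default. The matrix is rebuilt in a compact form with the same values as above.

```python
# Hand-rolled scorer for an already-gapped pair of strings (not a Biopython API).
# Assumption: a gap run of length n costs gap_open + (n - 1) * gap_extend.

bases = "ACGT"
score_matrix = {(x, y): (2 if x == y else -1) for x in bases for y in bases}
score_matrix[("X", "X")] = 2
for b in bases:
    score_matrix[("X", b)] = -11
    score_matrix[(b, "X")] = -11


def alignment_score(a, b, matrix, gap_open, gap_extend):
    """Sum column scores of two equal-length gapped strings."""
    if len(a) != len(b):
        raise ValueError("aligned strings must have equal length")
    score = 0.0
    prev_gap = None  # which string held the gap in the previous column
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            # Continuing a gap in the same string extends it; otherwise open a new one.
            score += gap_extend if side == prev_gap else gap_open
            prev_gap = side
        else:
            score += matrix[(x, y)]
            prev_gap = None
    return score


# Tail of the alignment as reported (the final 'G' is paired with 'A').
reported_a = "CTCTCACCAGGAAG----------"
reported_b = "CTCTC--------AXXXXXXXXXX"

# Tail with the 'A' moved in front of the gap (the 'A' is paired with 'A').
expected_a = "CTCTCACCAGGAAG----------"
expected_b = "CTCTCA--------XXXXXXXXXX"

print(alignment_score(reported_a, reported_b, score_matrix, -1, -0.5))  # -1.0
print(alignment_score(expected_a, expected_b, score_matrix, -1, -0.5))  # 2.0
```

Under these assumptions the second arrangement scores 3 points higher (an A/A match at +2 instead of a G/A mismatch at -1, with identical gap costs), which is why the reported alignment does not look optimal to me.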
Is there a bug in Biopython or am I missing something?
Any help is appreciated.
Boris
Thank you Markus, upgrading to 1.68 did help.