Question

Local alignment in biopython


9.5 years ago

tinysnippets ▴ 40

Hi there, folks! I'm confused about Biopython's local alignment function (localxx from here on). I have a rather short DNA sequence of about 300 bp; let's call it S. I also have a large set of short (about 75 bp) sequences that are candidates to be S's prefix; let's call them R. What I'd like to do is align the members of R to S with a local alignment algorithm. Say S is "PREFIXPART_LOOONGSUFFIXPART" and a member of R is "PREFIX"; this yields the following:

>>> from Bio import pairwise2
>>> for res in pairwise2.align.localxx("PREFIX", "PREFIXPART_LOOONGSUFFIXPART"): print(res)
('PREFI-----------------X----', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 23)
('PRE----------------F-IX----', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 23)
('PREF-----------------IX----', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 23)
('PRE-----------------FIX----', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 23)
('PREFIX---------------------', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 6)

As you can see, it works as expected for these two strings, but in my real case it fails to recognize that the prefix aligns to the template one-to-one (or with very minor modifications), and instead of "compact" alignments I get very scattered ones. How should I handle this? Should I use some kind of scoring matrix or something?

sequence alignment • 2.6k views

updated 3.0 years ago by Ram 45k • written 9.5 years ago by tinysnippets ▴ 40

score 3 · Accepted Answer · 2016-02-03


9.5 years ago

Eric T. ★ 2.9k

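The function localxx scores each match as 1 and applies no gap penalties, so a scattered alignment scores just as well as a compact one. You do want to penalize gaps. You could try localxs instead, which takes a gap-open and a gap-extension penalty (both -1 here):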

>>> for res in pairwise2.align.localxs("PREFIX", "PREFIXPART_LOOONGSUFFIXPART", -1, -1): print(res)
('PREFIX---------------------', 'PREFIXPART_LOOONGSUFFIXPART', 6.0, 0, 6)
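
To see why the gap penalty makes the difference, here is a minimal pure-Python Smith-Waterman sketch. It is not Biopython's code: the function name local_align and its parameters are made up for illustration, and it uses a single per-position gap score rather than separate open and extend penalties. It counts how many matrix cells tie for the best score, which is roughly why localxx reports several equally good scattered alignments while a negative gap score leaves only the compact one.

```python
def local_align(a, b, match=1.0, mismatch=0.0, gap=0.0):
    """Smith-Waterman local alignment with one linear gap score (gap <= 0).

    Hypothetical helper for illustration only. Returns
    (best, end_cells, aligned_a, aligned_b), where end_cells lists every
    matrix cell (i, j) that reaches the best score and the alignment is
    traced back from the first of them. Scores are assumed to be
    integer-valued, so the float comparisons below are exact.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0.0,                    # start a new local alignment
                H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,      # a[i-1] against a gap
                H[i][j - 1] + gap,      # b[j-1] against a gap
            )

    best = max(max(row) for row in H)
    if best <= 0:
        return 0.0, [], "", ""
    end_cells = [(i, j) for i in range(rows) for j in range(cols)
                 if H[i][j] == best]

    # Trace back from the first best cell until the score drops to zero.
    i, j = end_cells[0]
    out_a, out_b = [], []
    while H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, end_cells, "".join(reversed(out_a)), "".join(reversed(out_b))


read, template = "PREFIX", "PREFIXPART_LOOONGSUFFIXPART"
for gap in (0.0, -1.0):
    best, ends, al_a, al_b = local_align(read, template, gap=gap)
    print(f"gap={gap}: score={best}, tied end cells={len(ends)}, "
          f"alignment={al_a} / {al_b}")
```

With gap=0.0 the best score can be carried sideways through the matrix for free, so many end cells tie. With gap=-1.0 every gap position costs a point, so only the exact, ungapped match keeps the top score.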
