Question

Best tool for finding short alignments between two sequences

7.6 years ago

juan.crescente ▴ 110

I have one long sequence (several kb), which I can cut into pieces, and I want to find small parts of it that align to each other (short repeats, 10 to 500 bp long). What is the best algorithm to use, and why? Preferably I want to use it inside my Python script.

It should allow mismatches and gaps with a penalty (for example, about 1 mismatch per 10 bp and one gap).

I've tried pairwise2.align.localxx, but it gave me very long alignments as output.

alignment python • 2.7k views

updated 7.6 years ago by BioinfGuru ★ 2.1k • written 7.6 years ago by juan.crescente ▴ 110
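A note on the `localxx` result: in Biopython's `pairwise2` naming, the `xx` suffix means matches score 1 while mismatches and gaps cost nothing, so a local alignment can keep extending for free, which is probably why the output was so long. Below is a minimal pure-Python sketch of local alignment (Smith-Waterman) with explicit penalties. The function name and the scoring values (`match=2`, `mismatch=-3`, `gap=-5`) are illustrative assumptions, not tuned recommendations, and the linear gap penalty is a simplification.

```python
def smith_waterman(a, b, match=2, mismatch=-3, gap=-5):
    """Return (score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are assumptions for illustration; gaps use a simple
    linear penalty (every gapped position costs `gap`).
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; cells never go below 0, which is what
    # makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # match or mismatch
                          H[i - 1][j] + gap,      # gap in b
                          H[i][j - 1] + gap)      # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Two made-up sequences sharing a 10 bp core with one mismatch.
seq_a = "CCCCCACGTTGCATGAAAAA"
seq_b = "TTTTTACGTAGCATGGGGGG"
score, aln_a, aln_b = smith_waterman(seq_a, seq_b)
print(score, aln_a, aln_b)
```

With penalties in place the alignment stops at the edges of the shared core instead of running on through unrelated flanking sequence. This is quadratic in time and memory, so for several kb it is usable but slow; it is meant to show the effect of the penalties, not to replace a dedicated tool.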

To be honest, this sounds like something a motif analysis tool could do.

For example, to find a short, enriched set of motif repeats in one long sequence, maybe try MEME. You probably won't care about the TFs that bind, but it should find short enriched sequences in your long sequence. You can then align those sequences with a multiple sequence aligner.

7.6 years ago by BioinfGuru ★ 2.1k

score 0 · Answer 1 · 2017-12-19

7.6 years ago

dariober 15k

BLAST and vmatch come to mind. Both are good at finding subsequences that align between a query and a reference. vmatch in particular is very flexible. Since the output of vmatch is not in any standard format, I wrote a converter to SAM format (here).
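To give a feel for how subsequence matches can be located quickly, here is a rough pure-Python sketch of the general "seed" idea: index every k-mer and report those that occur more than once. It is only an illustration, not how BLAST or vmatch are implemented internally; the function name and the choice of `k` are assumptions. It finds exact seeds only, so mismatches and gaps would have to be handled by extending each seed with a local alignment afterwards.

```python
from collections import defaultdict


def find_repeat_seeds(seq, k=10):
    """Map each k-mer occurring more than once to its start positions.

    Hypothetical helper for illustration: exact matches only, forward
    strand only (reverse-complement repeats are not considered).
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    # Keep only k-mers seen at two or more positions.
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


# Toy sequence made of the same 7-mer twice.
seeds = find_repeat_seeds("GATTACAGATTACA", k=7)
print(seeds)
```

A smaller `k` tolerates more divergence between repeat copies but produces more chance hits, so it is a trade-off rather than a fixed setting.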

The problem with BLAST is that it takes one sequence as the query and one as the database. I want to find small repeats / alignments between the two of them. I'll check vmatch.

7.6 years ago by juan.crescente ▴ 110
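On the database concern: BLAST+ `blastn` accepts `-subject` in place of `-db`, so two FASTA files can be compared directly without building a database. A minimal sketch of calling it from Python follows. The file names, the helper names, and the `word_size=7` choice are assumptions for illustration, and it presumes `blastn` from BLAST+ is installed and on the `PATH`.

```python
import shutil
import subprocess


def blastn_two_sequences_cmd(query_fasta, subject_fasta, word_size=7):
    """Build a blastn command comparing two FASTA files, no database.

    Hypothetical helper; word_size=7 is an assumed value to pick up
    short matches, not a tuned recommendation.
    """
    return [
        "blastn",
        "-task", "blastn",
        "-query", query_fasta,
        "-subject", subject_fasta,  # used instead of -db
        "-word_size", str(word_size),
        "-outfmt", "6",             # tabular output, one hit per line
    ]


def run_blastn(query_fasta, subject_fasta):
    """Run blastn if it is available and return its tabular output."""
    if shutil.which("blastn") is None:
        raise RuntimeError("blastn not found on PATH")
    cmd = blastn_two_sequences_cmd(query_fasta, subject_fasta)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Example file names are placeholders.
print(" ".join(blastn_two_sequences_cmd("piece_a.fasta", "piece_b.fasta")))
```

Passing the same file as both query and subject would report matches of the sequence against itself, including the trivial full-length hit, which would need to be filtered out.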

score 0 · Answer 2 · 2017-12-19

7.6 years ago

BioinfGuru ★ 2.1k

Plenty of tools here:

https://www.ebi.ac.uk/Tools/psa/
