Levenshtein Distance Vs Smith-Waterman Algorithm
3
1
Entering edit mode
13.2 years ago
Bioscientist ★ 1.7k

Edit distance is the minimization of sequence differences, so more suitable for short highly-similar sequence alignment?

And Smith-Waterman is the maximization of alignment, so better for longer string search, say in BLAST?

WHat's the difference?

alignment • 21k views
ADD COMMENT
10
Entering edit mode
13.2 years ago
Fabian Bull ★ 1.3k

Your are confusing something:

The Levenstein distance is a mathematical distance. Its equal to the number of mutations that would make bouth string equal. It doesn't give you any alignment. Its just a measurement for how diverse two strings are.

The Smith-Waterman algorithm is (as the name suggestes) an algortihm. Its independent from any distance function. It calculates an alignment that minimizes the costs given by certain distance function. Its aim is to align two sequences in a way that similar subsequences are aligned together.

BLAST however is a heuristic that gives you an approximation of the best local alignment.

Before you ask: If you want to align entire sequences you should look for Needleman-Wunsch algorithm. Its similar to Smith-Waterman but tries to align the whole sequences together.

ADD COMMENT
0
Entering edit mode

Levenstein distance is not a distance metric, hence you can't call it a "mathematical distance".

ADD REPLY
0
Entering edit mode

Actually, Levenshtein distance is a metric (identity, symmetry and triangular inequality hold).

ADD REPLY
0
Entering edit mode

Yes, you are right.

ADD REPLY
4
Entering edit mode
13.2 years ago

Levenshtein is an algorithm computing the global distance between two strings.

Smith-Waterman is an algorithm finding the best local sequence alignment:

ADD COMMENT
4
Entering edit mode

this answer kind of misses the point. had he said Needleman-Wunsch in his question what would your answer be?

ADD REPLY
1
Entering edit mode

the levenstein distance doesnt compute anything but there are algorithms to compute the levenstein distance.

ADD REPLY
0
Entering edit mode

fixed .

ADD REPLY
0
Entering edit mode
13.2 years ago

http://en.wikipedia.org/wiki/Needleman–Wunsch_algorithm

Needleman and Wunsch formulated their problem in terms of maximizing similarity. Another possibility is to minimize the edit distance between sequences, introduced by Vladimir Levenshtein. Peter H. Sellers showed[5] in 1974 that the two problems are equivalent.

ADD COMMENT
0
Entering edit mode

While reading the wiki levenshtein article I didnt saw any exchangeable distance function. The needleman-wunsch algorithm has an exchangeable distance function. So how is it possible that both problems are equal?

ADD REPLY
0
Entering edit mode

what does exchangeable distance mean? They are equivalent because penalizing mismatches and gaps has the same effect as rewarding matches. Didn't you see that Peter Sellers movie?

ADD REPLY
0
Entering edit mode

Who is Peter Sellers? But: I didnt know gap-costs and penalties are parameters of the algorithm for levenshtein. Because you want the distance to be the number of mutations. If you can modify: you are completly right.

ADD REPLY
0
Entering edit mode

"allowable edit operations being insertion, deletion, or substitution of a single character"

ADD REPLY
0
Entering edit mode

So who is Peter Sellers?

ADD REPLY
0
Entering edit mode
ADD REPLY

Login before adding your answer.

Traffic: 2283 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6