Question

Cons of Smith-Waterman Alignment

0

3.5 years ago

Student ▴ 30

Hello.

I was reading this article about Whole-Genome Alignment and Comparative Annotation. I find it difficult to understand this excerpt, which describes the drawbacks of using the Smith-Waterman algorithm for the alignment problem:

Another consideration is how genome rearrangements complicate the alignment problem. Smith–Waterman and Needleman–Wunsch both produce alignments that have fixed order and orientation; that is, insertions, deletions, and substitutions are the only allowed edit operations. When looking within short or well-conserved sequences, like genes, this requirement is usually fulfilled. But at large evolutionary distances and looking within a sufficiently large window, genomes almost always contain more complex rearrangements with respect to each other—inversions, transpositions, and duplications all cause breaks in order and orientation that cannot be captured under constant order and orientation.

My question is: what do they mean by "orientation"?

Smith-Waterman Genomics Alignment DNA Sequences • 2.0k views

ADD COMMENT • link updated 3.5 years ago by German.M.Demidov ★ 2.9k • written 3.5 years ago by Student ▴ 30

score 1 · Answer 1 · 2021-05-23

1

3.5 years ago

Guillermo ▴ 20

Hello there!

Normally when people talk about gene orientation, they are referring to whether the gene is encoded on the positive or negative strand of DNA.
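
To make the strand idea concrete, here is a minimal Python sketch. The helper name `reverse_complement` and the toy sequence are my own assumptions for illustration, not something from the article. A gene on the negative strand reads, 5'->3', as the reverse complement of the positive-strand sequence covering the same position:

```python
# Hypothetical helper for illustration; not taken from any particular library.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of `seq`, read 5'->3'."""
    # Complement every base, then reverse the string.
    return seq.translate(COMPLEMENT)[::-1]


plus_strand = "ATGGCC"  # toy sequence, made up for this example
minus_strand = reverse_complement(plus_strand)
print(minus_strand)  # GGCCAT
```

Applying the function twice gives back the original sequence, which is why "orientation" only has two possible values.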

I found a complimentary video that may help: https://www.youtube.com/watch?v=JC6ew2xnJBA

ADD COMMENT • link 3.5 years ago by Guillermo ▴ 20

score 1 · Answer 2 · 2021-05-24

1

3.5 years ago

German.M.Demidov ★ 2.9k

Well, inversions and translocations are expected in a genome. A segment that is oriented as ---===>----> in the genome of one organism may be oriented as ---<====----> in another organism of the same species. Dynamic programming algorithms such as Smith-Waterman cannot detect this, because they only compare the two sequences in one fixed order and direction.

ADD COMMENT • link 3.5 years ago by German.M.Demidov ★ 2.9k

0

I do not understand why dynamic programming algorithms cannot detect this. For example, if I align a gene whose sequence is written 5'->3' with another one written 3'->5', is the Smith-Waterman algorithm not able to do it?

ADD REPLY • link 3.5 years ago by Student ▴ 30

1

You can check it here: https://www.ebi.ac.uk/Tools/psa/emboss_water/ (choose DNA as the sequence type). Enter this sequence:

ACGACACGTAGCAGCATGCAGCATCATACAGCATCACAGTCAGTTTCAGCAGCAAACTACAGT

and its reverse

TGACATCAAACGACGACTTTGACTGACACTACGACATACTACGACGTACGACGATGCACAGCA

The result:

EMBOSS_001         1 ACGAC--------ACGTAGCAGCATGCAGCATCATACAGCA     33
                     |||||        |||||..|..||||        ||||||
EMBOSS_001        31 ACGACATACTACGACGTACGACGATGC--------ACAGCA     63

Not really what we would expect, right?
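
If you prefer to see it in code, here is a minimal sketch of the Smith-Waterman score computation in Python. The function names and the scoring values (match +2, mismatch -1, linear gap -2) are assumptions I picked for illustration; they are not the settings EMBOSS water uses. It only returns the best local score, not the alignment itself. I use the reverse complement here because that is how an inverted segment appears when you read along the same strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of `seq`, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def sw_score(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score (Smith-Waterman, linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment ending at (i-1, j-1) with a match or mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Every cell depends only on its left, upper and upper-left neighbours,
            # so both sequences are always read in the same direction.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


seq = "ACGACACGTAGCAGCATGCAGCATCATACAGCATCACAGTCAGTTTCAGCAGCAAACTACAGT"
inverted = reverse_complement(seq)

print(sw_score(seq, seq))       # 126: all 63 bases match
print(sw_score(seq, inverted))  # lower: only short chance matches are found
print(sw_score(seq, reverse_complement(inverted)))  # 126 again after flipping back
```

The recurrence has no operation for "flip this segment", so the only way to recover the full score is to flip one of the inputs yourself before aligning. For a whole inverted gene that works; for an inversion buried inside a longer region, a single fixed-orientation alignment cannot represent it.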

ADD REPLY • link 3.5 years ago by German.M.Demidov ★ 2.9k

0

OK, thank you. I looked at the algorithm again, and it does make sense to me that it cannot detect this.

ADD REPLY • link 3.5 years ago by Student ▴ 30

1

Yep, I wanted to answer "see the algorithm itself", but then realized that you may not come from a computer science background, so I gave an example =)

ADD REPLY • link 3.5 years ago by German.M.Demidov ★ 2.9k