Hello.
I was reading this article about Whole-Genome Alignment and Comparative Annotation. I find it difficult to understand this extract from the article, which describes the drawbacks of using the Smith-Waterman algorithm for the alignment problem:
Another consideration is how genome rearrangements complicate the alignment problem. Smith–Waterman and Needleman–Wunsch both produce alignments that have fixed order and orientation; that is, insertions, deletions, and substitutions are the only allowed edit operations. When looking within short or well-conserved sequences, like genes, this requirement is usually fulfilled. But at large evolutionary distances and looking within a sufficiently large window, genomes almost always contain more complex rearrangements with respect to each other—inversions, transpositions, and duplications all cause breaks in order and orientation that cannot be captured under constant order and orientation.
My question is: what do they mean by "orientation"?
I do not understand why dynamic programming algorithms cannot detect this. For example, if I align a gene whose sequence is written 5'->3' against another that is written 3'->5', is the Smith-Waterman algorithm unable to align them?
You can check it here: https://www.ebi.ac.uk/Tools/psa/emboss_water/ (enter a DNA sequence):
and its reverse
The result:
Not really what we would expect, right?
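If you would rather try this locally, here is a minimal sketch of the same experiment in Python. It is not the EMBOSS Water implementation: the scoring values (match 2, mismatch -1, linear gap -2), the example sequence, and the helper names `smith_waterman_score` and `reverse_complement` are all assumptions chosen for illustration. The point is only that the recurrence walks both sequences left to right, so a reversed copy never lines up along a single diagonal.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (illustrative values)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Each cell only looks at cells above/left: order is fixed.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def reverse_complement(seq):
    """Opposite strand of a DNA sequence, read 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


seq = "ATGGCGTACGTTAGCCTGAA"  # made-up example sequence

same = smith_waterman_score(seq, seq)                          # identical copy
reversed_copy = smith_waterman_score(seq, seq[::-1])           # plain reverse
other_strand = smith_waterman_score(seq, reverse_complement(seq))

# Workaround used by many tools: score the query against both strands
# of the target and keep the better one.
target = reverse_complement(seq)
both_strands = max(
    smith_waterman_score(seq, target),
    smith_waterman_score(seq, reverse_complement(target)),
)

print(same, reversed_copy, other_strand, both_strands)
```

The identical copy reaches the maximum possible score, while the reversed copy and the opposite strand only pick up short chance matches. Aligning against both strands recovers a single inversion of the whole sequence, but it still cannot represent an alignment where only part of the sequence is inverted or moved, which is the limitation the article describes.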
OK, thank you. I looked at the algorithm again, and it now makes sense to me that it does not detect this.
Yep, I wanted to answer "see the algorithm itself", but then realized that you may not come from a computer science background, so I gave an example =)