I'm trying to implement Needleman-Wunsch with an affine gap penalty model.
According to Algorithms in Bioinformatics: A Practical Introduction (slides), I have to initialize four tables and back-trace through them.
To do the back-tracing, another table is initialized and filled with the help of a maximum function, which returns the maximum value to be stored in table V(i,j) plus the associated symbol ( \, | or - ) to be stored in the back-tracing table.
This approach works with simple queries, but once the strings become more complicated the results diverge from the reference, as in the following example:
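For reference, one common way to write the four-table recurrences is sketched below. Only V(i,j) is named in this post; the table names G, E and F, the gap opening penalty h, the per-position extension penalty s and the substitution score \delta are assumptions here and may not match the slides' notation. Under this convention a gap of length k costs h + k s.

$$
\begin{aligned}
V(i,j) &= \max\{\,G(i,j),\; E(i,j),\; F(i,j)\,\} \\
G(i,j) &= V(i-1,j-1) + \delta(S[i], T[j]) \\
E(i,j) &= \max\{\,E(i,j-1) - s,\; V(i,j-1) - h - s\,\} \\
F(i,j) &= \max\{\,F(i-1,j) - s,\; V(i-1,j) - h - s\,\}
\end{aligned}
$$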
Target: TGCTAGTATAAACCTTATGGTATCTGCAGCAGAGGTTTCTTTAATCTCTCAATAGTAGATGCTTTGAAAC
Read: TTATCTATAATTTGGTATTGTAATGACAGTTTGTGTTTGGTTTTTTCTTCAGTAT
Alignment:
TGC-TAG-TATAAACCT-TATGGTATCTG-CAGCA-GAGGTTTCTTTAATCTCTCAATAGTAGATGCTTTGAAAC
---TTATCTATAATTTGGTATTGTAA-TGACAGTTTGTGTTTGGTTTTTTCT-TCAGTAT---------------
score = 46
whereas EMBOSS needle gives:
TGCTA-GTATAAACCTTATGGTATCTGCA--GCAGAG-----GTTT-CTTTAATCTCTCAATAGTAGATGCTTTGAAAC
--TTATCTATAA----TTTGGTAT-TGTAATG-ACAGTTTGTGTTTGGTTTTTTCT-TCAGTAT---------------
score = 52
My question: is this the right approach to implementing NW with an affine gap penalty model?
Here is my implementation in C++ (GitHub).
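// Returns the largest of the three candidate scores and writes the matching
// back-trace symbol to *figure: '\' diagonal, '|' vertical, '-' horizontal.
// Ties are never resolved in favour of the diagonal.
int max(int diagonal, int vertical, int horizontal, char *figure) {
    int best = 0 ;
    if( diagonal > vertical && diagonal > horizontal )
    {
        best = diagonal ;
        *figure = '\\' ;
    }
    else if (vertical > horizontal)
    {
        best = vertical ;
        *figure = '|' ;
    }
    else
    {
        best = horizontal ;
        *figure = '-' ;
    }
    return best ;
}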
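As a cross-check for the score alone, here is a minimal sketch of a four-table affine-gap fill in C++. It is not the code from the linked repository: the function name affineGlobalScore, the simple match/mismatch scoring (instead of DNAFULL) and the convention that a gap of length k costs gapOpen + k * gapExtend are all assumptions made for illustration. Tools differ on whether the first gap position is charged the opening penalty alone or opening plus extension, so that convention is worth checking before comparing scores.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): optimal global alignment
// score under an affine gap model where a gap of length k costs
// gapOpen + k * gapExtend. mismatch, gapOpen and gapExtend are positive penalties.
int affineGlobalScore(const std::string& s, const std::string& t,
                      int match, int mismatch, int gapOpen, int gapExtend) {
    const int n = static_cast<int>(s.size());
    const int m = static_cast<int>(t.size());
    const int NEG = -1000000000;  // stands in for minus infinity
    using Table = std::vector<std::vector<int>>;

    Table V(n + 1, std::vector<int>(m + 1, NEG));  // best score of any state
    Table G = V;  // alignment ends with s[i] aligned to t[j] (diagonal)
    Table E = V;  // alignment ends with a gap in s (horizontal move)
    Table F = V;  // alignment ends with a gap in t (vertical move)

    V[0][0] = 0;
    for (int i = 1; i <= n; ++i) V[i][0] = F[i][0] = -gapOpen - i * gapExtend;
    for (int j = 1; j <= m; ++j) V[0][j] = E[0][j] = -gapOpen - j * gapExtend;

    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            const int sub = (s[i - 1] == t[j - 1]) ? match : -mismatch;
            G[i][j] = V[i - 1][j - 1] + sub;
            // Either extend an existing gap or open a new one from the best state.
            E[i][j] = std::max(E[i][j - 1] - gapExtend,
                               V[i][j - 1] - gapOpen - gapExtend);
            F[i][j] = std::max(F[i - 1][j] - gapExtend,
                               V[i - 1][j] - gapOpen - gapExtend);
            V[i][j] = std::max({G[i][j], E[i][j], F[i][j]});
        }
    }
    return V[n][m];
}
```

For the back-trace, it generally helps to record which of G, E or F each maximum came from, and whether E and F were reached by extending or by opening. A single symbol per cell of V cannot tell an opened gap from an extended one, and that is a frequent cause of suboptimal affine-gap alignments.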
This is the standard approach (I have not checked your code, though). The difference between your implementation and EMBOSS could be due to a difference in scoring. Did you use the same scoring matrix, gap open and gap extension penalties in both cases?
Yes, I use the DNAFULL scoring matrix and the same values for gap open and gap extension.
So it could be something in your code. Compare it to other implementations, e.g. in C, C#, Java, Perl.
Thanks a lot, Jean :))
Hello mfb.bioinfo!
It appears that your post has been cross-posted to another site: http://stackoverflow.com/questions/38831040
This is typically not recommended, as it risks annoying people in both communities.