Dear all,
I installed the Needleman-Wunsch module (Algorithm::NeedlemanWunsch) but am having trouble using it. Could anyone write a simple example to illustrate how to use it? My second question: two sequences can align in several different ways. My work only needs the best alignment, but I'd like to know how to output the other alignments. Which parameter controls this? My third question: from the literature I know the scores are normally set as match: 1, mismatch: -1, gap: -2. What, then, are gap extension and gap open? What's your favorite scoring scheme? Note that I am aligning nucleotide sequences.
I would much appreciate your help with any of my three questions. Thank you!
SYNOPSIS
    use Algorithm::NeedlemanWunsch;

    sub score_sub {
        if (!@_) {
            return -2; # gap penalty
        }
        return ($_[0] eq $_[1]) ? 1 : -1;
    }

    my $matcher = Algorithm::NeedlemanWunsch->new(\&score_sub);
    my $score = $matcher->align(
        \@a,
        \@b,
        {   align        => \&on_align,
            shift_a      => \&on_shift_a,
            shift_b      => \&on_shift_b,
            select_align => \&on_select_align
        });
DESCRIPTION
Sequence alignment is a way to find commonalities in two (or more) similar sequences or strings of some items or characters. Standard motivating example is the comparison of DNA sequences and their functional and evolutionary similarities and differences, but the problem has much wider applicability - for example finding the longest common subsequence (that is, "diff") is a special case of sequence alignment. ... ...
METHODS
Standard algorithm
new(\&score_sub [, $gap_penalty ])
The constructor. Takes one mandatory argument, which is a coderef to a sub implementing the similarity matrix, plus an optional gap penalty argument. If the gap penalty isn't specified as a constructor argument, the "Algorithm::NeedlemanWunsch" object gets it by calling the scoring sub without arguments; apart from that case, the sub is called with 2 arguments, which are items from the first and second sequence, respectively, passed to "Algorithm::NeedlemanWunsch::align". Note that the sub must be pure, i.e. always return the same value when called with the same arguments.
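As a rough illustration of the two constructor forms described above, the sketch below builds one matcher whose scoring sub also answers the no-argument call, and one that receives the gap penalty explicitly. The sub names `score_with_gap` and `score_pair` and the short test sequences are made up for this example; the match/mismatch/gap values are the ones from the question.

```perl
use strict;
use warnings;
use Algorithm::NeedlemanWunsch;

# Form 1: the scoring sub doubles as the source of the gap penalty.
# The module calls it with no arguments to ask for that penalty.
sub score_with_gap {
    return -2 unless @_;            # gap penalty
    my ($x, $y) = @_;               # one item from each sequence
    return $x eq $y ? 1 : -1;       # match : mismatch
}
my $matcher1 = Algorithm::NeedlemanWunsch->new(\&score_with_gap);

# Form 2: the gap penalty is passed to the constructor, so the
# scoring sub only has to handle the two-argument case.
sub score_pair {
    my ($x, $y) = @_;
    return $x eq $y ? 1 : -1;
}
my $matcher2 = Algorithm::NeedlemanWunsch->new(\&score_pair, -2);

# Illustrative sequences; align() takes array references, one item per element.
my @s = qw(A T G C);
my @t = qw(A T C);

# All callbacks are optional, so an empty hash is enough to get the score.
my $score1 = $matcher1->align(\@s, \@t, {});
my $score2 = $matcher2->align(\@s, \@t, {});
print "form 1: $score1, form 2: $score2\n";
```

Both forms describe the same scoring scheme, so the two scores should agree. Either sub is pure in the sense the documentation asks for: the result depends only on its arguments.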
align(\@a, \@b [, \%callbacks ])
The core of the algorithm. Creates a bottom-up dynamic programming matrix, fills it with alignment scores and then traces back to find an optimal alignment, informing the application about its items by invoking the callbacks passed to the method.
The first 2 arguments of "align" are array references to the aligned sequences, the third a hash reference with user-supplied callbacks. The callbacks are identified by the hash keys, which are as follows:
align
: Aligns two sequence items. The callback is called with 2 arguments, which are the positions of the paired items in "\@a" and "\@b", respectively.
shift_a
: Aligns an item of the first sequence with a gap in the second sequence. The callback is called with 1 argument, which is the position of the item in "\@a".
shift_b
: Aligns a gap in the first sequence with an item of the second sequence. The callback is called with 1 argument, which is the position of the item in "\@b".
select_align
: Called when there's more than one way to construct the optimal alignment, with 1 argument which is a hashref enumerating the possibilities. The hash may contain the following keys:
    align
    : If this key exists, the optimal alignment may align two sequence items. The key's value is an arrayref with the positions of the paired items in "\@a" and "\@b", respectively.
    shift_a
    : If this key exists, the optimal alignment may align an item of the first sequence with a gap in the second sequence. The key's value is the position of the item in "\@a".
    shift_b
    : If this key exists, the optimal alignment may align a gap in the first sequence with an item of the second sequence. The key's value is the position of the item in "\@b".
All keys are optional, but the hash will always have at least one. The callback must select one of the possibilities by returning one of the keys.
All callbacks are optional. When there is just one way to make the optimal alignment, the "Algorithm::NeedlemanWunsch" object prefers calling the specific callbacks, but will call "select_align" if it's defined and the specific callback isn't.
Note that "select_align" is called instead of the specific callbacks, not in addition to them - users defining both "select_align" and other callbacks should probably call the specific callback explicitly from their "select_align", once it decides which one to prefer.
Also note that the passed positions move backwards, from the sequence ends to zero - if you're building the alignment in your callbacks, add items to the front. ... ...
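On the question of seeing alignments other than the first one: going only by the documentation quoted above, "select_align" looks like the hook for this, since it is called wherever several moves are equally good. The sketch below is one possible way to use it, not the module's own recipe. The helper `align_with_preference` is hypothetical: it breaks ties in a caller-supplied order and calls the matching specific callback itself, because "select_align" replaces that call. Each run still produces a single alignment; changing the preference order may surface a different one with the same score.

```perl
use strict;
use warnings;
use Algorithm::NeedlemanWunsch;

sub score_sub {
    return -2 unless @_;                    # gap penalty
    return $_[0] eq $_[1] ? 1 : -1;         # match : mismatch
}

# Hypothetical helper: align two strings, breaking ties in @preference order.
sub align_with_preference {
    my ($s1, $s2, @preference) = @_;
    my @x = split //, $s1;
    my @y = split //, $s2;
    my (@row_x, @row_y);

    # Positions arrive from the sequence ends backwards, so prepend.
    my %handler = (
        align   => sub { unshift @row_x, $x[$_[0]]; unshift @row_y, $y[$_[1]] },
        shift_a => sub { unshift @row_x, $x[$_[0]]; unshift @row_y, '-' },
        shift_b => sub { unshift @row_x, '-';       unshift @row_y, $y[$_[0]] },
    );
    my %callbacks = (
        %handler,
        select_align => sub {
            my ($options) = @_;
            for my $move (@preference) {
                next unless exists $options->{$move};
                my $pos = $options->{$move};        # arrayref for align, index otherwise
                $handler{$move}->(ref $pos ? @$pos : $pos);
                return $move;                       # tell the module which move was taken
            }
            die "preference list covers none of the offered moves";
        },
    );

    my $matcher = Algorithm::NeedlemanWunsch->new(\&score_sub);
    my $score   = $matcher->align(\@x, \@y, \%callbacks);
    return ($score, join('', @row_x), join('', @row_y));
}

# 'AAT' against 'AT' can be written as -AT or A-T at the same score.
my @first  = align_with_preference('AAT', 'AT', qw(align shift_a shift_b));
my @second = align_with_preference('AAT', 'AT', qw(shift_a shift_b align));
print join("\n", @first), "\n\n", join("\n", @second), "\n";
```

Listing every optimal alignment would take more than this, for example rerunning with different choices at each tie; the sketch only shows where the choice is made.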
I think you are asking a lot here. I would recommend limiting the scope of your post to one or two questions and providing some relevant information about the errors (rather than copying all the documentation). What exactly are you trying to do, and what problems did you have with the module? Also, what do you want as output: the alignment, the scores, or something else?
Hi SES, thanks for your advice. I clicked ADD COMMENT and ADD REPLY, but neither worked (I tried IE and Firefox), so I have to use this answer box. Regarding my original question, I am mostly concerned with how to use this module. Could anyone give me an example of using it? For instance, with
$seq1 = 'ATGCATGCATGC'; $seq2 = 'ATGGATCGAGC';
I am new to Perl, especially to using modules. Thank you very much!
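A minimal sketch for the two sequences above, assuming the scoring scheme from the original question (match 1, mismatch -1, gap -2) and the callback interface as quoted from the module's documentation. The variable names are illustrative. The module works on arrays of items, so each string is first split into single characters, and because the callbacks report positions from the ends backwards, each aligned column is added to the front.

```perl
use strict;
use warnings;
use Algorithm::NeedlemanWunsch;

# One array element per nucleotide.
my @seq1 = split //, 'ATGCATGCATGC';
my @seq2 = split //, 'ATGGATCGAGC';

sub score_sub {
    return -2 unless @_;            # called without arguments: gap penalty
    my ($x, $y) = @_;
    return $x eq $y ? 1 : -1;       # match : mismatch
}

my (@row1, @row2);                  # the two rows of the alignment
my %callbacks = (
    # item $i of @seq1 is paired with item $j of @seq2
    align   => sub { my ($i, $j) = @_; unshift @row1, $seq1[$i]; unshift @row2, $seq2[$j] },
    # item $i of @seq1 faces a gap in @seq2
    shift_a => sub { my ($i) = @_; unshift @row1, $seq1[$i]; unshift @row2, '-' },
    # item $j of @seq2 faces a gap in @seq1
    shift_b => sub { my ($j) = @_; unshift @row1, '-'; unshift @row2, $seq2[$j] },
);

my $matcher = Algorithm::NeedlemanWunsch->new(\&score_sub);
my $score   = $matcher->align(\@seq1, \@seq2, \%callbacks);

my $aligned1 = join '', @row1;
my $aligned2 = join '', @row2;
print "$aligned1\n$aligned2\nscore: $score\n";
```

This prints one optimal alignment as two gapped rows followed by its score. If only the score is needed, the callbacks can be left out.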