7.6 years ago
dmathog
Has anyone ever encountered a program which aligns two sequences and then merges them by following the longest path?
That is, when it comes to an indel, it always uses the insertion rather than the deletion.
This would be used more or less like EMBOSS megamerger, except that program works by crossing over between designated "front" and "back" sequences, and this one needs to follow a more complex path through the alignment array.
I can see ways to edit the output of something like "ggsearch" into the desired result, e.g.:
            300       310       320       330       340       350
seq1 agagagagagaaagaaagaaagaaagaagagaaagacacatatagatatatagagagaga
     ::::::::::::::::::::::::    :::::::::::::::::::::::::: :::::
seq2 AGAGAGAGAGAAAGAAAGAAAGAA----GAGAAAGACACATATAGATATATAGATAGAGA
            430       440           450       460       470
becomes
seq12 AGAGAGAGAGAAAGAAAGAAAGAAagaaGAGAAAGACACATATAGATATATAGATAGAGA
but was hoping for a prepackaged solution.
Thanks.
How about writing a script to parse the output and compare the bases at every position in both sequences?
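A minimal sketch of that idea in Python, assuming the two gapped rows have already been pulled out of the aligner output as equal-length strings. The function name `merge_longest` is made up for illustration. The rule that the second sequence wins wherever both rows have a base is also an assumption: it matches the example in the question, where the merged result follows seq2 at the one mismatch, but it is not something the aligner prescribes.

```python
def merge_longest(aligned_a, aligned_b, gap="-"):
    """Merge two gapped rows of a pairwise alignment, never taking a gap.

    Assumption: where both rows have a base, the base from aligned_b is
    used; where one row has a gap, the base from the other row is used.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")

    merged = []
    for a, b in zip(aligned_a, aligned_b):
        if b != gap:
            merged.append(b)   # base present in the second row
        elif a != gap:
            merged.append(a)   # gap in the second row: take the insertion
        # a column that is a gap in both rows contributes nothing
    return "".join(merged)


# The two rows from the ggsearch example in the question
seq1 = "agagagagagaaagaaagaaagaaagaagagaaagacacatatagatatatagagagaga"
seq2 = "AGAGAGAGAGAAAGAAAGAAAGAA----GAGAAAGACACATATAGATATATAGATAGAGA"
print(merge_longest(seq1, seq2))
# AGAGAGAGAGAAAGAAAGAAAGAAagaaGAGAAAGACACATATAGATATATAGATAGAGA
```

Keeping the original case of each row, as here, makes it easy to see which sequence every merged base came from.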
That's pretty much what I ended up doing, and wrote it in the general form for an MSA rather than just a pair of sequences. The algorithm is this: