8.2 years ago
nchuang
Hey guys,
I wrote a Python script that calls nucmer to align a query sequence against a 10,000-nucleotide sequence. The query length varies from 20 bases to as many as 2000 bases. nucmer seems to find a match for the longer queries but not for the shorter ones. Is this simply not what nucmer/MUMmer is designed to do, and should I just stick with Smith-Waterman?
Thanks!
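For reference, the Smith-Waterman fallback mentioned above can be sketched in plain Python roughly as follows. This is only a minimal illustration with a linear gap penalty; the function name and the scoring values (match=2, mismatch=-1, gap=-2) are assumptions picked for the example, not anything taken from nucmer or from the original script.

```python
def smith_waterman(query, target, match=2, mismatch=-1, gap=-2):
    """Local alignment with a linear gap penalty (illustrative scoring values).

    Returns (score, query_start, query_end, target_start, target_end),
    using 0-based, half-open coordinates.
    """
    rows, cols = len(query) + 1, len(target) + 1
    # H[i][j] = best score of a local alignment ending at query[i-1], target[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if query[i - 1] == target[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if query[i - 1] == target[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, i, best_pos[0], j, best_pos[1]


if __name__ == "__main__":
    print(smith_waterman("ACGTACGT", "TTTTACGTACGTTTTT"))
```

The matrix has len(query) x len(target) cells, so a 2000-base query against 10,000 bases is about 20 million cells, which is slow in pure Python; an existing aligner library would probably be the better choice for real runs.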
I would try MAFFT7.
http://mafft.cbrc.jp/alignment/software/algorithms/algorithms.html
"Updated! (2015/Jun) Parameters for E-INS-i have been changed in version 7.243. The new parameters work better for aligning a set of long sequences and short sequences that are closely related to each other. To disable this change, add the --oldgenafpair option.
With the new parameters, E-INS-i may be able to align multiple cDNAs and multiple genomic sequences of a gene from closely related species. However, it consumes large memory space when the sequences are long."
Oh, I guess that's another option. I was hoping to keep all the code I wrote for reading and parsing NUCmer output, though. I like how MUMmer spits out a coordinate file.
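If the goal is to stay with NUCmer, one thing that might be worth trying before switching tools is lowering its match thresholds for short queries. According to the MUMmer manual, nucmer defaults to a minimum match length of 20 (`-l`) and a minimum cluster length of 65 (`-c`), so a 20-base query may never form a cluster long enough to be reported. The sketch below only builds the command line; the helper names, the file names, and the rule for scaling the thresholds with query length are assumptions made for illustration, not recommended values.

```python
import subprocess


def build_nucmer_cmd(reference, query, prefix="out", query_len=None):
    """Build a nucmer command with thresholds relaxed for short queries.

    The scaling rule is a hypothetical heuristic: with no query_len,
    nucmer's documented defaults (-l 20, -c 65) are passed explicitly.
    """
    if query_len is None:
        min_match, min_cluster = 20, 65
    else:
        # Never ask for a match longer than half the query (floor of 8).
        min_match = max(8, min(20, query_len // 2))
        # A cluster cannot be longer than the query itself.
        min_cluster = max(min_match, min(65, query_len))
    return [
        "nucmer",
        "-l", str(min_match),    # minimum length of a single exact match
        "-c", str(min_cluster),  # minimum length of a cluster of matches
        "-p", prefix,            # output prefix, gives <prefix>.delta
        reference,
        query,
    ]


def run_nucmer(reference, query, prefix="out", query_len=None):
    """Run nucmer (must be on PATH); raises CalledProcessError on failure."""
    cmd = build_nucmer_cmd(reference, query, prefix, query_len)
    subprocess.run(cmd, check=True)
    return prefix + ".delta"


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the resulting command.
    print(" ".join(build_nucmer_cmd("ref.fa", "qry.fa", query_len=20)))
```

The resulting .delta file can still be passed to show-coords, so existing coordinate-parsing code should not need to change. Lowering `-l` and `-c` will likely produce more spurious hits, so some filtering on identity or length is probably needed afterwards.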