I have been using newbler for a while now, and the rate of misalignments is very high. I have found that these misalignments are correlated with the homopolymer problem, but the reads involved should still be alignable correctly.
For example, this is produced in the 454PairAlign.txt (top is read, bottom is reference):
TAA-TTTT--GTATTTTTGTA
TAATTTTTGTA-TTTTTTGTA
I have omitted the full reads for brevity.
The read has undercalled both T homopolymers by one base each. However, the alignment reported as best is clearly wrong: the GTA in the middle should be aligned, leaving just two single-base gaps, one in each homopolymer.
Has anyone found a way of getting newbler to align such cases correctly (e.g. with different parameters)?
It seems like this must be a problem with the aligner, which appears to prefer three gaps plus two substitutions to two gaps (in homopolymers).
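To put numbers on that, here is a rough sketch that scores the alignment newbler reported against the one I would expect. The scoring values (match +1, mismatch -1, gap open -2, gap extend -1) and the function name are my own assumptions for illustration; they are not newbler's actual parameters, which I don't know.

```python
def score_alignment(read, ref, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Score a gapped pairwise alignment column by column.

    Returns (score, number_of_gaps, number_of_substitutions), where a gap
    is a run of consecutive '-' characters in the same row.
    The scoring values are illustrative assumptions, not newbler's.
    """
    if len(read) != len(ref):
        raise ValueError("aligned rows must have the same length")
    score = gaps = subs = 0
    prev_gap_row = None  # row holding the gap in the previous column, if any
    for a, b in zip(read, ref):
        if a == "-" or b == "-":
            row = "read" if a == "-" else "ref"
            if row == prev_gap_row:
                score += gap_extend  # same gap continues
            else:
                score += gap_open    # a new gap starts
                gaps += 1
            prev_gap_row = row
        else:
            prev_gap_row = None
            if a == b:
                score += match
            else:
                score += mismatch
                subs += 1
    return score, gaps, subs


# Alignment as written in 454PairAlign.txt (read on top, reference below).
NEWBLER_READ = "TAA-TTTT--GTATTTTTGTA"
NEWBLER_REF = "TAATTTTTGTA-TTTTTTGTA"

# Alignment I would expect: one single-base gap in each homopolymer.
EXPECTED_READ = "TAA-TTTTGTA-TTTTTGTA"
EXPECTED_REF = "TAATTTTTGTATTTTTTGTA"

reported = score_alignment(NEWBLER_READ, NEWBLER_REF)
expected = score_alignment(EXPECTED_READ, EXPECTED_REF)
print("reported:", reported)
print("expected:", expected)
```

Under these assumed scores the two-gap alignment wins comfortably, so whatever newbler is optimising, it does not look like a plain match/mismatch/gap scheme of this kind.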
Roche guys, please don't read further (and this is not an answer, btw): maybe the best way to deal with the errors would be to avoid newbler as an alignment tool if possible. I tried it about 1.5 years ago, when it had even more bugs. I would look for an open source tool instead and possibly ignore homopolymers.
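One way to read "ignore homopolymers" is homopolymer compression: collapse every run of identical bases to a single base before comparing. A minimal sketch on the fragment from the question follows; the function name is made up for illustration, and this is not something newbler does as far as I know.

```python
from itertools import groupby


def collapse_homopolymers(seq):
    """Collapse each run of identical bases to a single base."""
    return "".join(base for base, _run in groupby(seq))


# Ungapped fragments taken from the alignment in the question.
read = "TAATTTTGTATTTTTGTA"
ref = "TAATTTTTGTATTTTTTGTA"

print(collapse_homopolymers(read))
print(collapse_homopolymers(ref))
```

After collapsing, the read and reference fragments are identical, so the only differences here are homopolymer lengths. The trade-off is that real homopolymer-length variants are thrown away as well.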
not an answer but there are these mysterious parameters mentioned in the manual:-
but I am not that clear how newbler works. I wish they would explain it more clearly in the manual.
Michael, I tried BWA-SW but found even more problems in the alignments. Any suggestions for a good long-read aligner?
Didn't see your comment until now. Have you tried LASTZ or BLAT?