I am working with Oxford Nanopore MinION data for small genomes that I am trying to assemble de novo. For training, I have a few datasets with reference genomes and have been comparing various de novo assemblers. So far Unicycler has given me the best results, but I have not been able to find much information on polishing or otherwise handling multiple separate contigs when one long contig is desired. Sometimes the same assembler outputs a single contig, and other times it outputs many separate contigs, even though the contigs overlap enough that they could hypothetically be connected.
I completed some genome polishing tutorials, for example with Nanopolish, but realized that polishing may not do what I want, which is to combine separate contigs into one draft genome sequence. What are the standard tools for this task? Should I expect to do it manually with a visualization or mapping tool? Is pairwise alignment or multiple sequence alignment (MSA) helpful here?
Additionally, is there a reason why state-of-the-art assembly tools are unable to complete these assemblies automatically (into a single contig, that is)? I do not believe I have any unsequenced stretches, since my genomes are so small.
The vast majority of genome assemblies deposited in databases such as GenBank do not include complete chromosomes as a single continuous sequence. Is there some particular reason why contigs aren't good enough for you?
If you have related genomes, reference-based scaffolding tools such as these may be useful:
https://github.com/malonge/RaGOO
https://github.com/combogenomics/medusa
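To make the idea of reference-based scaffolding concrete, here is a minimal Python sketch of the general principle: order and orient contigs by where they map on a related reference, then join them with runs of N to mark gaps of unknown size. This is not how RaGOO or Medusa are implemented. The `Placement` record, the `scaffold` function, and the example coordinates are all hypothetical; in practice the placements would come from aligning your contigs to the reference with a mapping tool.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Hypothetical record: where one contig maps on the reference."""
    name: str
    seq: str
    ref_start: int   # leftmost reference coordinate of the alignment
    reverse: bool    # True if the contig aligned to the reverse strand


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def scaffold(placements: list[Placement], gap_len: int = 100) -> str:
    """Order contigs by reference position, orient them, and join with N gaps.

    The gap length is arbitrary (an assumption): the Ns only mark that the
    true distance between neighbouring contigs is unknown.
    """
    ordered = sorted(placements, key=lambda p: p.ref_start)
    oriented = [
        reverse_complement(p.seq) if p.reverse else p.seq
        for p in ordered
    ]
    return ("N" * gap_len).join(oriented)


if __name__ == "__main__":
    # Toy contigs with made-up reference coordinates.
    contigs = [
        Placement("contig_2", "GGGTTT", ref_start=5000, reverse=False),
        Placement("contig_1", "AAACCC", ref_start=10, reverse=False),
        Placement("contig_3", "TTTGGA", ref_start=9000, reverse=True),
    ]
    print(scaffold(contigs, gap_len=5))
```

Note that a scaffold built this way inherits the structure of the reference: if your genome has rearrangements relative to the related genome, the contig order may be wrong, so treat the result as a draft.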