Does the HGAP assembler use both long and short reads, or only long reads?
I ask because they write in their article:
"We developed a consensus algorithm that preassembles long and highly accurate overlapping sequences by correcting errors on the longest reads using shorter reads from the same library."
In "using shorter reads from the same library", are those long reads or short reads from the same library?
We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction.
The workflow is as follows:
A workflow to first preassemble reads, assemble the preassembled reads using Celera® Assembler, then polish using Quiver.
The longest reads are selected as 'seed' reads, to which all other reads are mapped.
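To make the idea of seed reads concrete, here is a minimal sketch of one simple way to split a read set into seeds and the remaining reads. It is not HGAP's actual code: the function name, the `seed_coverage` parameter, and its default value are my own assumptions for illustration. It takes the longest reads until they add up to a chosen coverage of the genome; everything shorter would then be mapped onto those seeds.

```python
def select_seed_reads(read_lengths, genome_size, seed_coverage=30):
    """Split reads into seeds (longest) and the rest.

    Hypothetical helper for illustration only; HGAP's own cutoff
    logic may differ. seed_coverage=30 is an assumed default.
    """
    target_bases = genome_size * seed_coverage
    ordered = sorted(read_lengths, reverse=True)  # longest first

    seeds = []
    total = 0
    for length in ordered:
        if total >= target_bases:
            break  # enough seed bases collected
        seeds.append(length)
        total += length

    others = ordered[len(seeds):]        # reads that get mapped to the seeds
    cutoff = seeds[-1] if seeds else 0   # shortest read still used as a seed
    return cutoff, seeds, others


lengths = [1000, 8000, 500, 6000, 3000, 700]
cutoff, seeds, others = select_seed_reads(lengths, genome_size=1000, seed_coverage=10)
print(cutoff, seeds, others)
```

All reads here come from the same long-read library; "shorter" only means shorter than the seed length cutoff.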
As you can see in the figure, the longest reads (in blue) are used first, and the shorter reads are mapped to them; the result is the preassembled reads. These preassembled reads are then used to build the assembly, as shown in the second part of the figure. Keep in mind that HGAP gives its best de novo assembly results when at least 60X-100X coverage has been generated.
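To check the coverage guideline before running an assembly, you can estimate fold coverage as the total number of sequenced bases divided by the expected genome size. A minimal sketch, where the function names are my own and the 60X default is just the lower bound of the range mentioned above:

```python
def fold_coverage(read_lengths, genome_size):
    """Estimated coverage = total sequenced bases / genome size."""
    return sum(read_lengths) / genome_size


def meets_coverage_guideline(read_lengths, genome_size, minimum=60):
    """True if the estimated coverage reaches the assumed minimum (60X)."""
    return fold_coverage(read_lengths, genome_size) >= minimum


# Toy example: 1200 reads of 5 kb each on a 100 kb genome.
reads = [5000] * 1200
print(fold_coverage(reads, 100_000))
print(meets_coverage_guideline(reads, 100_000))
```

The genome size is only an estimate for a de novo project, so treat the result as approximate.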
Thank you for your reply.
Can you explain the term "seed reads"?
Thank you