What is the best way to use single reads to close gaps (stretches of Ns) in a genome and combine contigs/scaffolds?
IMHO, if you could not close the gaps automatically in the first round of assembly, not even with different (less stringent) settings such as the minimum coverage required to join two contigs, then reassembling the same data as single reads is not going to be worth the effort.
If you want to close the gaps, either fall back on classical PCR and Sanger sequencing for the smaller gaps, or get yourself a paired-end or mate-pair run and do a combined assembly, or try to fill the gaps in the scaffolds with a tool like IMAGE from the Sanger Institute (paper).
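To decide which gaps are small enough for the PCR-Sanger route, it helps to list where the runs of Ns are and how long they are. Here is a minimal sketch in plain Python, assuming the scaffolds are in a FASTA file; the function names `read_fasta` and `find_gaps` and the `min_len` parameter are made up for illustration and do not come from any assembler. Keep in mind that N-run lengths written by a scaffolder are usually estimates, not measured gap sizes.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_gaps(seq: str, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return 0-based, end-exclusive (start, end) spans of N runs in seq.

    min_len is a hypothetical filter: runs shorter than this are skipped.
    """
    return [
        (m.start(), m.end())
        for m in re.finditer(r"[Nn]+", seq)
        if m.end() - m.start() >= min_len
    ]


if __name__ == "__main__":
    # Toy input standing in for a real scaffold file.
    demo = [">scaffold1", "ACGTNNNNAC", "GTNNACGT", ">scaffold2", "ACGT"]
    for name, seq in read_fasta(demo):
        for start, end in find_gaps(seq):
            print(f"{name}\t{start}\t{end}\t{end - start}")
```

For a real assembly, pass an open file handle to `read_fasta` instead of the toy list. The output columns are scaffold name, gap start, gap end, and gap length, which is enough to sort the gaps by size and pick PCR candidates.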
PacBio long-read sequencing, even at low coverage, could be a good solution for scaffolding if, for example, you have an initial de novo assembly of a bacterial genome.
Can you please be more specific? What data do you have? What is the end goal? Do you have additional sequencing to improve your assembly? Well-defined problems tend to have the clearest answers.
Yes, I have a draft genome with a number of contigs, and now I have some additional sequences that I want to use to improve the assembly. I was thinking of using Velvet, with the contigs as long reads and the additional reads from the new sequencing run as short reads, but maybe there are better ways to do this. Thanks in advance for any suggestions.
More technical details, please: What platform are the new reads from? How long are they?