Hi,
I have a couple of fungal genomes that I'm reassembling from scratch, as I didn't realise at first how many Illumina adapters were still present in my reads.
I assembled them with Velvet, iterating over the k-mer length to find the optimum assembly based on the statistics reported by abyss-fac. I recently read that assemblies can be improved without further sequencing by at least a couple of different methods:
- Map the reads against the assembly, extract the "properly paired" reads and reassemble them with the same k-mer length. Take a look here
- Use software designed for this, such as REAPR(?)
I proceeded with #1. While the re-assembly of one genome gave exactly the same assembly statistics, the other changed quite a bit (top: initial assembly; bottom: re-assembly):
n |n:500 |n:N50 |min |N80 |N50 |N20 |E-size |max |sum |name
------ |------ |------ |------ |------ |------ |------ |------ |------ |------ |------
7290 |6860 |1122 |503 |4914 |11504 |22431 |14693 |124406 |43.73e6 |T2paper/velvet/k169/contigs.fa
10638 |8598 |1437 |500 |3980 |9021 |17417 |11669 |124406 |43.75e6 |T2paper/velvet/rek169/contigs.fa
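For anyone wanting to check these columns by hand, here is a minimal Python sketch of how the N50/N80/N20 and E-size values are usually defined. It is not abyss-fac's own code. It assumes, as the `n:500` column suggests, that only contigs of at least 500 bp enter the statistics, and the function names (`nxx`, `e_size`, `summarize`) are made up for illustration.

```python
def nxx(lengths, percent):
    """Length L such that contigs of length >= L hold at least
    `percent` % of the total assembled bases (N50 when percent == 50)."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point edge cases.
        if running * 100 >= percent * total:
            return length
    return 0  # empty input


def e_size(lengths):
    """Sum of squared lengths divided by the sum of lengths."""
    total = sum(lengths)
    return sum(l * l for l in lengths) / total if total else 0.0


def summarize(lengths, min_length=500):
    """Summary similar in spirit to one row of the table above.
    The 500 bp cutoff is an assumption taken from the n:500 column."""
    kept = [l for l in lengths if l >= min_length]
    return {
        "n": len(lengths),
        "n:%d" % min_length: len(kept),
        "min": min(kept) if kept else 0,
        "N80": nxx(kept, 80),
        "N50": nxx(kept, 50),
        "N20": nxx(kept, 20),
        "E-size": e_size(kept),
        "max": max(kept) if kept else 0,
        "sum": sum(kept),
    }
```

With contig lengths read from each `contigs.fa`, calling `summarize(lengths)` on both assemblies gives two dictionaries that can be compared key by key.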
So, the questions...
- Is this step common?
- Is there an easy way to compare the assemblies side by side, or to evaluate them without relying on those numbers?
Thank you in advance,
Xabier
I think step #1 from the posted link does not refer to extracting the reads and reassembling them, but rather to estimating the fragment length from the paired-end reads and then reassembling with that information.
Edit: They do have this: "Now we created a fairly good assembly, but lets see if we can do it better. Lets try to map the reads to the assembly and then only use mapped reads for another assembly.", but as others said, I don't think this will help the assembly.
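Note that "mapped" (as in the quote) and "properly paired" (as in the question) are not the same filter. A minimal Python sketch of the difference, using only the FLAG bits defined in the SAM format (0x1 paired, 0x2 properly paired, 0x4 unmapped); the helper names are made up for illustration, and this is not what the linked tutorial runs:

```python
PAIRED = 0x1       # read is one of a pair
PROPER_PAIR = 0x2  # both mates aligned as the aligner expects
UNMAPPED = 0x4     # this read did not align


def sam_flag(line):
    """FLAG is the second tab-separated field of a SAM record."""
    return int(line.split("\t")[1])


def is_mapped(flag):
    return not flag & UNMAPPED


def is_properly_paired(flag):
    return bool(flag & PAIRED) and bool(flag & PROPER_PAIR)


def keep_records(sam_lines, predicate):
    """Yield alignment records whose FLAG satisfies `predicate`,
    skipping header lines (those starting with '@')."""
    for line in sam_lines:
        if line.startswith("@"):
            continue
        if predicate(sam_flag(line)):
            yield line
```

A read with FLAG 65 is mapped but not properly paired, so it survives the "mapped reads" filter and is dropped by the "properly paired" one. On real data, `samtools view -f 2` selects the properly paired records directly.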