Hi all,
First of all, sorry if this question is basic. Could you please explain why de Bruijn graph-based assemblers for transcriptome assembly break short reads into shorter fragments (k-mers) and then look for overlaps between the k-mers, instead of using the entire read to find overlapping segments? What is the benefit of this approach relative to the overlap algorithm used by CAP3?
Thanks
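Time... It takes a lot more time to do it the old-fashioned way (with overlaps), because every read has to be checked against every other read. A de Bruijn assembler instead indexes the k-mers in a hash table, so reads that share a k-mer get linked without comparing reads pairwise.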
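To make the difference concrete, here is a minimal Python sketch (not how CAP3 or any real assembler is implemented). The function names, the toy reads and k=4 are all made up for illustration. It counts how many read pairs a naive all-vs-all overlap step would have to look at, then builds a tiny de Bruijn graph in a single pass over the reads and walks it. Real assemblers also handle sequencing errors, reverse complements, branches and coverage, and overlap assemblers use heuristics to avoid the full all-vs-all comparison, so treat this only as a picture of the idea.

```python
from collections import defaultdict


def pairwise_comparisons(n_reads):
    """Number of read pairs a naive all-vs-all overlap step must examine."""
    return n_reads * (n_reads - 1) // 2


def kmers(read, k):
    """Yield every substring of length k in a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def build_de_bruijn(reads, k):
    """Link each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it.

    Each read is scanned once, so the work grows with the total number of
    k-mers rather than with the number of read pairs.
    """
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Follow edges from start while there is exactly one way forward."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop on a cycle instead of looping forever
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Toy reads (hypothetical data), each overlapping the next one.
reads = ["ATGGCGT", "GGCGTGC", "CGTGCAA"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")

print(pairwise_comparisons(len(reads)))   # pairs for 3 reads
print(pairwise_comparisons(1_000_000))    # pairs for a million reads
print(contig)
```

With a million reads the naive pair count is already around 5 x 10^11, while the number of k-mers to hash grows only linearly with the number of reads. That is the time saving in a nutshell.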
If you like tutorials in addition to papers, Homolog-blog covers really cool things about de novo assembly in general. Take a look :)
This is a slightly older paper, but in case you have not seen it, it may be useful.
You might find the following papers useful.
Comparison of De Novo Genome Assembly Software
Sequence assembly demystified