7.9 years ago
saranpons3
Hello all, I would like to know why we need a distributed assembler based on MPI/MapReduce when we can store the de Bruijn graph compactly using FM-index, Bloom filter, and succinct graph structure techniques, and assemble human genomes on a single-node computer with less RAM. (For example, the Minia assembler (http://minia.genouest.org/) used only 5.7 GB of RAM to assemble a human genome.) I would also like to know whether the assembly running time of distributed assemblers based on MPI/MapReduce is better than that of assemblers based on FM-index, Bloom filter, and succinct graph structures.
Who told you that you need a distributed assembler? As far as I know, there are very few tools based on MapReduce available at the moment.
Some of the earlier assemblers didn't yet use de Bruijn graphs or the more memory-efficient methods. Since memory on individual cluster nodes is often not that high, people tended to use MPI so they would have access to more memory across nodes. I don't know of any that used MapReduce, though. Anyway, with the advent of Minia and similar tools, this isn't really needed any more.
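To make the memory argument a bit more concrete, here is a rough sketch of how a Bloom filter can stand in for an explicit de Bruijn graph: only k-mer membership is stored, and edges are recovered by querying the four possible extensions. This is not Minia's actual implementation; the `BloomFilter` class, the `kmers` and `successors` helpers, and the filter sizes are all made up for illustration, and a real assembler also has to deal with false positives and reverse complements.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter over strings (illustrative, not tuned)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting BLAKE2b with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(),
                digest_size=8,
                salt=seed.to_bytes(8, "little"),
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def successors(bloom, kmer):
    """Neighbours of kmer in the implicit graph: try all four extensions."""
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in bloom]


# Hypothetical toy input: one short read, k = 4.
bloom = BloomFilter(num_bits=1 << 16, num_hashes=3)
for km in kmers("ACGTAC", 4):
    bloom.add(km)

next_kmers = successors(bloom, "ACGT")
```

The point is that the graph is never stored as nodes and edges: memory is a fixed number of bits per k-mer regardless of k, which is roughly why a single machine can be enough.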