I have downloaded kmergenie 1.7038 and attempted to compile it on (1) Ubuntu 14.01, (2) a cluster which I think is based on Suse Linux, and (3) Mac OS X (10.10.5). The compilation instructions are very simple ("make"), but the build has failed on all three platforms. The failures seem to occur while linking the bundled ntcard software. On Ubuntu, the long string of 'undefined reference' and 'access beyond end' errors concludes with:
/usr/bin/ld: ntcard-ntcard.o: access beyond end of merged section (36032)
/usr/bin/ld: ntcard-ntcard.o(.debug_info+0x69e5): reloc against `.debug_str': error 2
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
make[2]: *** [ntcard] Error 1
On the surface, the errors do not look the same on the different platforms. To check ntcard itself, I downloaded and compiled that application separately, and it built with no errors.
I would be grateful for any suggestions of how to get this to compile. Thanks!
I don't believe that KmerGenie has a solid theoretical ground for its claims.
Why? I don't know. It doesn't make any sense to me.
However, BBMap has a tool called TadWrapper that will rapidly do assemblies at various kmer lengths and tell you which assembly actually had the best contiguity. You can use it like this:
Will that tell you the exact optimal kmer length for the assembler that you eventually plan to use? No, that's impossible; the only way to do that is to assemble at multiple kmer lengths with the actual assembler you will use. But it will give you a very close approximation, since it actually does an assembly at each kmer length.
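To make "best contiguity" concrete, here is a minimal Python sketch of the comparison step: given the contig lengths from assemblies at several kmer lengths, it scores each one by N50 and picks the winner. This is only an illustration under assumptions. The function names and the toy contig lengths are made up, and N50 is used as a stand-in metric; it is not necessarily the exact criterion TadWrapper applies.

```python
def n50(contig_lengths):
    """Return the N50: the contig length at which the running total,
    taken from longest to shortest, first reaches half the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


def best_k_by_n50(assemblies):
    """assemblies maps kmer length -> list of contig lengths.
    Returns (k, N50) for the highest N50; ties go to the smaller k."""
    scored = {k: n50(lengths) for k, lengths in assemblies.items()}
    best = max(scored, key=lambda k: (scored[k], -k))
    return best, scored[best]


# Hypothetical contig lengths, one list per kmer length tried.
toy_assemblies = {
    31: [500, 400, 300, 200, 100],
    55: [900, 400, 200],
    77: [700, 300, 250, 250],
}

if __name__ == "__main__":
    k, score = best_k_by_n50(toy_assemblies)
    print(f"best k = {k}, N50 = {score}")
```

In practice you would fill the dictionary by reading the contig FASTA produced at each kmer length rather than typing lengths by hand.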
If you do want to follow KmerGenie's approach and find out which kmer length yields the maximal number of unique kmers, you can do that with BBMap's "kmercountmulti.sh" tool, which is extremely fast. But I don't recommend that.
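For anyone unsure what "number of unique kmers per kmer length" means, the idea can be sketched in a few lines of Python. This is a toy, exact count over in-memory reads, with hypothetical function names and made-up reads. It ignores reverse complements and sequencing errors, and it is not how KmerGenie or kmercountmulti.sh work internally; those tools are built for real data sets that would not fit in a Python set.

```python
def distinct_kmers(reads, k):
    """Count the distinct substrings of length k across all reads."""
    seen = set()
    for read in reads:
        # range() is empty when the read is shorter than k
        for i in range(len(read) - k + 1):
            seen.add(read[i:i + k])
    return len(seen)


def distinct_kmer_profile(reads, k_values):
    """Map each kmer length to its number of distinct kmers."""
    return {k: distinct_kmers(reads, k) for k in k_values}


if __name__ == "__main__":
    # Made-up reads, for illustration only.
    reads = ["ACGTTGCAAGCT", "GTTGCAAGCTTA", "TGCAAGCTTACG"]
    profile = distinct_kmer_profile(reads, [3, 5, 7])
    for k, count in sorted(profile.items()):
        print(k, count)
    # Pick the kmer length with the most distinct kmers.
    print("k with most distinct kmers:", max(profile, key=profile.get))
```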
BBMap is already compiled, so you just unzip it and it will work as long as you have Java installed.
Thank you for your suggestions, Brian. I will definitely look into it. /T
Hi Brian,
The theoretical foundations of kmergenie can be found in Section 2 of our article. Please feel free to email us if any detail there was unclear. The article was published in 2013, but I continue to believe that its theoretical grounds still hold for past and current Illumina single-k genome assemblies ;)
Rayan