It is hard because biologists want to be on the bleeding edge. That also means biologists are willing to take the pain of learning programming languages and tools that go beyond point-and-click. In return, in the long run, they save time and understand more deeply how the tools work. With that said, I think the big C++-based tools are likely written by bioinformaticians who come from a computer science background. They choose the language because C++ historically added features on top of C, the fastest widely used language (ignoring assembly for now). C++ offers object-oriented programming, so when the software gets more complicated it is easier to manage, just like Java. However, C++ is still more lightweight than Java, and because it is closer to C, it tends to be faster.
You mention distributed systems; that is where Java still shines over C++ (AFAIK), especially via Scala, where Java's role is reduced to providing the JVM. C++ was not designed with distributed computing in mind. There is a newer language, also very close to C, called D (surprising name!). I think it would be a good choice for anyone who wants to invest in learning a new language. It has the cleanliness of a modern language, speed comparable to C, support for concurrency, and a whole lot of other buzzwords. I imagine D could displace C++ and Java if the big players pick it up. Last time I heard, Facebook and Google were already taking up D, though I have not verified this. So the time will come.
Bottom line: C++ is still here because of its speed, the legacy code that exists in many areas outside of bioinformatics, and the inertia of how programming is taught.
Just my quick (maybe biased, and unchecked) opinion.
This type of "blanket" statement isn't exactly true. There are a considerable number of situations in which well-written C++ code can still outperform Java code, and in the context of many genomics problems, this speed difference matters. However, I might argue that an even more important and compelling reason for choosing C++ over Java is that C++ applications can (a) provide much greater control over memory usage and, as a result, (b) have substantially lower memory requirements than similarly performing Java programs. Java's memory management strategy (garbage collection) may work well for many general workloads. However, when you have very particular allocation patterns that you know about in advance (as is commonly the case in genomics applications), you can accommodate these easily in C++, but it's difficult (sometimes impossible without native extensions) to force Java to do what you want. I agree that the general lack of ability to have (or make) statically pre-compiled, broadly compatible C++ programs is a real pain. However, for a host of different applications, the performance characteristics of C++ are just too appealing to pass up.
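To make the "allocation pattern known in advance" point a bit more concrete, here is a minimal sketch, not taken from any particular tool. It assumes short k-mers (k up to 12) and 2-bit nucleotide encoding, so the whole counting table can be sized and allocated once, before any sequence is read. The names `encode_base` and `count_kmers` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: map a nucleotide to a 2-bit code, or -1 for anything else.
inline int encode_base(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Count all k-mers of `seq` in one flat table of 4^k 32-bit counters.
// The table is allocated once, up front; nothing is allocated inside the loop.
std::vector<std::uint32_t> count_kmers(const std::string& seq, unsigned k) {
    assert(k >= 1 && k <= 12);  // 4^12 counters * 4 bytes = 64 MiB at most
    const std::uint64_t mask = (std::uint64_t{1} << (2 * k)) - 1;
    std::vector<std::uint32_t> counts(std::size_t{1} << (2 * k), 0);

    std::uint64_t kmer = 0;  // rolling 2-bit-packed window
    unsigned valid = 0;      // consecutive valid bases seen so far
    for (char c : seq) {
        const int code = encode_base(c);
        if (code < 0) {      // e.g. 'N': restart the window
            valid = 0;
            kmer = 0;
            continue;
        }
        kmer = ((kmer << 2) | static_cast<std::uint64_t>(code)) & mask;
        if (++valid >= k) ++counts[kmer];
    }
    return counts;
}
```

The memory footprint here is fixed by `k` alone and is known before the first read is processed, which is the kind of guarantee that is harder to get when every counter is a separately managed object.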
I strongly agree with Rob's comment about the fine control over memory that comes with C++. Moreover, I am not 100% sure that Java can take advantage of specific hardware features such as SSE registers (which allow some algorithms to be vectorized, with large speedups).
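For readers who have not seen what using SSE registers looks like in practice, here is a small sketch under stated assumptions: an x86 compiler that defines `__SSE2__` (otherwise the code falls back to the plain loop), and a toy task of counting mismatching positions between two sequences. The function name `count_mismatches` is invented for this example, and no speedup is claimed or measured here.

```cpp
#include <bitset>
#include <cstddef>
#include <string>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Count positions where a and b differ, over the length of the shorter string.
std::size_t count_mismatches(const std::string& a, const std::string& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    std::size_t i = 0;
    std::size_t matches = 0;
#if defined(__SSE2__)
    // Compare 16 bytes per iteration using one 128-bit register per input.
    for (; i + 16 <= n; i += 16) {
        const __m128i va =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(a.data() + i));
        const __m128i vb =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(b.data() + i));
        // One bit per byte lane; a set bit means the two bytes were equal.
        const int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(va, vb));
        matches += std::bitset<16>(static_cast<unsigned>(mask)).count();
    }
#endif
    // Scalar tail (or the whole input when SSE2 is not available).
    for (; i < n; ++i) matches += (a[i] == b[i]) ? 1 : 0;
    return n - matches;
}
```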
Note that I am a great fan of Java, though. My view is that algorithms requiring high performance (both fast execution and low memory usage) should be written in C/C++ as a (dynamic) library, possibly with a command-line front end. The native functions of this library can then be called from another language such as Java (through JNI), or from languages like Python, Perl, ... (with SWIG, for instance).
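As a sketch of the library side of that setup, the usual first step is to expose a plain C interface from the C++ code, since C linkage is what most binding approaches expect to wrap. The function `gc_fraction` below is a hypothetical example, not part of any existing library, and the JNI or SWIG glue itself is not shown.

```cpp
#include <cstddef>

// extern "C" turns off C++ name mangling, so the symbol is exported
// under the plain name "gc_fraction".
extern "C" double gc_fraction(const char* seq, std::size_t len) {
    if (seq == nullptr || len == 0) return 0.0;
    std::size_t gc = 0;
    for (std::size_t i = 0; i < len; ++i) {
        const char c = seq[i];
        if (c == 'G' || c == 'C' || c == 'g' || c == 'c') ++gc;
    }
    return static_cast<double>(gc) / static_cast<double>(len);
}
```

Assuming the file is called `gc.cpp` (an arbitrary name), something like `g++ -O2 -shared -fPIC gc.cpp -o libgc.so` would build it as a shared library on Linux, which the binding layer can then load.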
This is one of the benefits of things like bioconda. With it, you don't have to deal with compiling binaries yourself (and yes, there's much more than Python available in there).