Question

News: How the strengths of Lisp-family languages facilitate building complex and flexible bioinformatics applications

1

7.9 years ago

Bohdan Khomtchouk ▴ 350

In collaboration with my colleagues, Edmund Weitz and Peter D. Karp, our lab has released a programming languages article about Lisp in bioinformatics and computational biology, now published in Briefings in Bioinformatics:

http://bib.oxfordjournals.org/content/early/2016/12/31/bib.bbw130.short?rss=1

I would be very grateful and honored to receive your collective feedback on this paper and welcome any comments/discussions.

Best wishes and Happy New Year!

Bohdan

ChIP-Seq RNA-Seq • 2.0k views

ADD COMMENT • link updated 19 months ago by Ram 44k • written 7.9 years ago by Bohdan Khomtchouk ▴ 350

2

"It should also be emphasized that while the C or Java compiler is ‘history’ once the compiled program is started, the Lisp compiler is always present and can thus generate new, fast code while the program is already running."

This is true in Java as well...

If Lisp is as performant as the paper indicates, I wonder why there seems to be a reluctance to develop performance-critical high-throughput sequencing bioinformatics tools in it, such as mapping and assembly programs? I can't imagine that it's bioinformatics library support; that mainly matters for downstream analysis, statistics, and visualization packages. Or perhaps there are some I'm not aware of? Most of them seem to be written in C/C++.

ADD REPLY • link 7.9 years ago by Brian Bushnell 20k

0

Hi Brian, many thanks for leaving this comment. Regarding Java, it sounds like you're talking about the Java Compiler API, which AFAIK is a slightly convoluted way of calling the "javac" program. I don't really see how that's different, except for convenience, from using a C program to generate a text file that is then fed to the C compiler to produce a shared library, which is then loaded. I'd be curious to know how an operation like tetration (https://en.wikipedia.org/wiki/Tetration) could be done in Java, and whether it can be done without generating an intermediate file. More importantly, can it be done from a deployed program (JAR file) when only the JRE is on the hard disk (as opposed to the JDK)? In Common Lisp, it would just be:

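  ;; Build the source form for VAR tetrated to height N, e.g. (expt x (expt x 1)) for N = 2.
  (defun tetration-code (var n)
    (if (zerop n)
        1
        `(expt ,var ,(tetration-code var (1- n)))))

  ;; Compile the generated lambda at run time and install it as MY-TETRATION.
  (compile 'my-tetration `(lambda (x) ,(tetration-code 'x 3)))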
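
As a minimal sketch of what the snippet above produces, here is how it could be exercised at a REPL. This assumes a standard Common Lisp implementation (for example SBCL); the input value 2 is chosen purely for illustration, and the definition is repeated only so the block runs on its own.

```lisp
;; Repeated from the post above so this sketch is self-contained.
(defun tetration-code (var n)
  (if (zerop n)
      1
      `(expt ,var ,(tetration-code var (1- n)))))

;; Inspect the generated source form before compiling it.
;; With the default printer settings this prints (EXPT X (EXPT X (EXPT X 1))).
(print (tetration-code 'x 3))

;; Compile the generated lambda at run time and install it as MY-TETRATION.
(compile 'my-tetration `(lambda (x) ,(tetration-code 'x 3)))

;; Call the freshly compiled function through its symbol, so a file compiler
;; does not warn about a function it has not seen defined yet.
;; 2^(2^(2^1)) = 2^4 = 16, so this prints 16.
(print (funcall 'my-tetration 2))
```

No intermediate source file is written here: the form returned by tetration-code is handed directly to compile in the running image.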

Regarding your other point about developing performance-critical high-throughput sequencing bioinformatics tools, I think the tide is definitely turning, and one of the primary goals of this paper is to draw attention to this. The bibliography cites several recent software packages (e.g., elPrep and cljam). Thanks again for leaving a comment.

ADD REPLY • link 7.9 years ago by Bohdan Khomtchouk ▴ 350

1

Hi Bohdan,

I think you're misunderstanding how Java compilation works, or at least, what I was trying to say. "Javac" is sort of a compiler, and sort of not; it's more like half of a compiler. It does not generate machine code, but byte code; without further compilation, Java can only be interpreted. That's one reason why javac is so fast compared to, say, compiling C or golang code - it's not completely compiling source code to a binary executable, or doing most common optimizations. The machine code is generated by the Java virtual machine at runtime. While running, performance-critical areas of code are recompiled with further optimizations (typically after that code path has been traversed X number of times), which allows dynamic optimization based on the input. Like garbage collection, this is all automatic; you don't manually add code directing when and where to compile things as in your example, except in some rare cases like regex matching. See HotSpot and JIT for further explanation.

ADD REPLY • link 7.9 years ago by Brian Bushnell 20k