I know that Java programs generally run faster than Python ones, so why are so many programs written in Python? Is it because developing in Python is faster?
Or is it because some libraries are available for Python but not for a language like Java?
I'm asking because I'm trying to write a tool that involves BLAST, I care a lot about performance, and I'm competent in both Java and Python.
So which one should I use to get better performance?
I will give a subjective list of advantages to using Python:
Learning Python is quicker than learning Java. What I mean by this is that if I have a class of students with limited programming experience and I teach 50% of them Java and 50% of them Python, I expect those who use Python to be writing more complex programs after a month.
It's faster to write code in Python than in Java.
It's easier to install and run code written in Python. I find that I am always having to wrestle with my version of Java. On my Mac, for instance, Java keeps getting broken by security patches.
Although Java is faster, some of the slowness of Python can be overcome with tools like Cython, which compiles Python code with optional type declarations to C.
I'd also add that there are far more bioinformatics-related libraries available for Python than for Java. Biopython is OK and, AFAIK, better developed and more extensive than BioJava. And, at least on the genomics side of things, you have pybedtools, PyVCF, and a whole host of other libraries meant to do common tasks well. On top of that, Python's built-in libraries, like the csv module, make working with standard bioinformatics and genomics file types and datasets very straightforward.
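To illustrate the point about the csv module, here is a minimal sketch that sums interval lengths per chromosome from BED-style text. The function name and the example intervals are made up for illustration, and it only assumes the first three tab-separated BED columns (chrom, start, end).

```python
import csv
import io


def total_length_per_chrom(handle):
    """Sum interval lengths per chromosome from tab-separated BED-style lines."""
    totals = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the header-like lines some BED files carry.
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        chrom, start, end = row[0], int(row[1]), int(row[2])
        # BED coordinates are 0-based and half-open, so length is end - start.
        totals[chrom] = totals.get(chrom, 0) + (end - start)
    return totals


# Made-up intervals, for illustration only.
example = io.StringIO(
    "chr1\t100\t200\tfeatA\n"
    "chr1\t300\t350\tfeatB\n"
    "chr2\t0\t75\tfeatC\n"
)
totals = total_length_per_chrom(example)
print(totals)
```

For a real file you would pass an open file handle instead of the io.StringIO object. For anything beyond simple column arithmetic, a dedicated library such as pybedtools is probably the better choice.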
One of the things I like a lot about Python is that the Pythonic approach is to have one best way of doing things. I think this often extends to a tendency for the Python community to concentrate on making a few good packages; for any given application, the choice is often relatively clear, and the package is usually well supported and actively developed, e.g. numpy, scipy, matplotlib, pandas, rpy/rpy2.
Definitely. Also easy_install, pip, and virtualenv make installation of most packages and the construction of specific application and development environments very straightforward.
Most of the so-called bioinfo programmes are just simple wrappers that run some other tools (usually quite computation-heavy) and process their results. In such cases, it doesn't really matter which language is faster, but which is handier to work with. For that, my choice is Python.
That said, I would go for Java if there is a need for a GUI or multi-platform support.
My guess is that BLAST will take most of the running time in your case, and "gluing it together" with Python or Java won't make a huge difference, so go with whatever is more convenient for you. I think you should check how easy it will be to parse the data and BLAST results in either language (it should be pretty easy with Python/Biopython; I don't know about Java).
You should probably also consider future and multi-OS compatibility if this tool is going to be used for a long time (in that case I think Java is better).
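To give a feel for how little glue code the parsing can need, here is a rough sketch using only the standard library rather than Biopython. It assumes BLAST+ was run with -outfmt 6 (tabular, default 12 columns, no comment lines); the best_hits function and the example rows are made up for illustration.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle):
    """Return {query_id: row}, keeping the highest-bitscore hit per query."""
    best = {}
    reader = csv.DictReader(handle, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        # Convert the numeric fields we compare or report.
        row["bitscore"] = float(row["bitscore"])
        row["evalue"] = float(row["evalue"])
        current = best.get(row["qseqid"])
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    return best


# Made-up rows, for illustration only.
example = io.StringIO(
    "q1\ts1\t98.5\t200\t3\t0\t1\t200\t10\t209\t1e-100\t360\n"
    "q1\ts2\t90.0\t180\t18\t0\t1\t180\t5\t184\t1e-60\t250\n"
    "q2\ts3\t100.0\t50\t0\t0\t1\t50\t1\t50\t2e-20\t93.5\n"
)
hits = best_hits(example)
for query, hit in hits.items():
    print(query, hit["sseqid"], hit["bitscore"])
```

If you request a custom column list or the commented format (-outfmt 7), the column names and the line filtering would need to be adjusted accordingly.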
That's my opinion at least, hope I helped.
Without getting into a ridiculous argument over which language is better: in most practical situations, development in Python (or any interpreted language) is just faster.
It also depends on what kind of tool you are developing. If you are writing a simple file converter, the speed difference will probably be negligible. If you are writing a computationally heavy piece of software, why not just do it in C?
As for the tool you are trying to develop that involves BLAST, I recommend first identifying where the bottleneck will be. Most likely it will be the BLAST process itself. If running BLAST takes 10 minutes and your Python/Java portion takes 10 or 20 seconds, does it really matter?
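One way to check this is to time each stage separately and see what share of the total your own code accounts for. This is only a sketch: the timed and glue_fraction helpers are hypothetical names, and the 600 s and 20 s figures are just the illustrative numbers from above, not measurements.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def glue_fraction(external_seconds, glue_seconds):
    """Fraction of the total runtime spent in your own (glue) code."""
    total = external_seconds + glue_seconds
    return glue_seconds / total if total else 0.0


# Illustrative figures only: 10 minutes of BLAST vs 20 seconds of glue code.
share = glue_fraction(600, 20)
print(f"glue code share of runtime: {share:.1%}")

# Timing an arbitrary step; here a trivial stand-in for a parsing function.
result, seconds = timed(sum, range(1000))
print(f"stand-in step took {seconds:.6f} s")
```

With those figures the glue code is only a few percent of the runtime, so even halving it would barely change the total. In a real pipeline you would wrap the BLAST call and the parsing step in timed and compare the two numbers.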
If you need to parse BLAST XML output with Java, just use xjc (the JAXB binding compiler) with the -dtd option and the DTD for BLAST to generate the parser classes.
I keep asking myself this question ... sigh