Forum:What programming language do you use the most in your work? Which one do you like the best?
10
3
3.9 years ago
biotrekker ▴ 110

Which language do you use the most in your work? Which language do you prefer? With regards to programming language, which direction do you see the bioinformatics field going?

Thanks!

Programming Python R Scripting • 4.9k views
ADD COMMENT
4

Programming languages are tools. Use the tools that get the job done. As a minimum, you need to be familiar with a shell (e.g. bash) and comfortable with a general-purpose high-level scripting language (e.g. Perl, Python). Depending on your corner of the woods, you may or may not need something else (e.g. if you need to write high-performance software). The Julia programming language would be a good one for scientific computing in general. However, for bioinformatics and other domains, it is where Python was 15+ years ago relative to Perl: a few devoted people busily rewriting libraries available in other languages, but still not enough coverage to win over the masses.

ADD REPLY
11
3.9 years ago

politically incorrect answer: PERL !! (and this mostly to annoy my peers :) )

but I must say the field of bioinformatics is definitely NOT evolving in the direction of PERL (it has kept the field alive for over a decade but its time has come, to be honest)

what I use most on a daily basis is Linux/bash scripting, not a real programming language but still damn useful for doing everyday bioinfo tasks

ADD COMMENT
4

Most under-appreciated benefit: I can take a perl script from 10 years ago and it will work 95% of the time. Which other mainstream bioinformatic language can you say that about?

ADD REPLY
5
3.9 years ago
steve ★ 3.5k

Rust and Go are the best languages I have tried. But Rust proved to be too low-level for daily use, and it's hard to find excuses for using Go when Python already has everything I need. However, as the complexity of your production applications grows, the advantages of languages like Rust and Go start to outweigh the "convenience" of Python. Being able to compile and deploy a single static binary to run your apps, without even needing a container for it, is huge. With interpreted languages like R and Python you pretty much need to ship an entire operating system for them to work, especially when using a large number of 3rd party libraries (and don't get me started on library management for these two...).

The strong static typing and more verbose function and class signatures in Rust & Go make the source code so much easier to understand. I also really appreciate how Rust lets you embed your unit tests directly into your program's source code in the same file.

ADD COMMENT
0

Between Rust and Go, which one would you recommend as being the more "Swiss army knife-y" and easier to learn?

ADD REPLY
0

If "Swiss Army Knife" and "easy to learn" are the criteria, then just stick with Python. Between the two, Go is probably the more logical next step, though both have aspects that can seem bizarre and confusing when coming from a Python background. Simple things like Python's csv.DictReader, which is the backbone of a lot of my bioinformatics scripting, don't always have a direct, easy equivalent in either language, making them harder to use as a "daily driver". One of the attractive features for me has been how Go obviates the need for 3rd party libraries such as Python's requests and flask for basic HTTP communication, and its goroutines make concurrency and parallelism a core feature of the language, much better than Python's multithreading.
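For anyone who hasn't used it, here is a minimal sketch of the kind of csv.DictReader scripting meant above. The sample sheet, its column names (`sample`, `tissue`, `reads`) and the helper `liver_samples` are made up for illustration; a real script would open a file instead of an in-memory string.

```python
import csv
import io

# Hypothetical tab-separated sample sheet (illustrative data only).
# In a real script this would be: open(path, newline="")
SAMPLE_SHEET = (
    "sample\ttissue\treads\n"
    "S1\tliver\t1200\n"
    "S2\tbrain\t800\n"
    "S3\tliver\t450\n"
)


def liver_samples(handle):
    """Return the names of all liver samples.

    DictReader uses the header line as keys, so each row is a dict
    and columns are addressed by name rather than by position.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row["sample"] for row in reader if row["tissue"] == "liver"]


names = liver_samples(io.StringIO(SAMPLE_SHEET))
print(names)
```

Addressing columns by name is what keeps scripts like this readable when the column order of the input changes.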

ADD REPLY
4
3.9 years ago
ATpoint 86k

Some inspirations:

Rust or C++, what to learn after Go for high-performance bioinformatics tools?

Bioinformatics: Which language should I learn

Best Language For Bioinformatics

Which Are The Best Programming Languages For A Bioinformatician?

Programming language use distribution from recent programs / articles

What I use/like most: R. Why? Because it is the only language I know well enough to actually do productive work with (I do not count bash as a language). As I do both the wetlab work and the subsequent analysis of the data I generate, I simply do not have the time to get up to speed in another language, even though I would really like to learn a second one, something like Rust, which seems to be what the cool kids learn these days. For sure not Perl.

At the end of the day the biology/science has to be done, and for this I would rather spend time keeping up with the ever-increasing amount of literature than learning a new language. That would of course be different if I were a software developer rather than an analyst. That said, I catch myself far too often coding stupid little things rather than reading papers, because coding is fun and reading papers is hard work, especially when you read several in a row only to find that none has a real biological message. That unfortunately gets more and more common with the ever-increasing amount of high-throughput data, which remains merely descriptive in far too many papers.

ADD COMMENT
2

The principle behind this is absolutely correct if you ask me. The correct language is the one you are most comfortable and productive in. For me that's probably Python (although R is not so far behind). In most cases you'll gain more performance out of picking the correct approach rather than the correct language. And if you are familiar with a language, you are less likely to get it wrong. Consider: if a Python script takes 20 minutes to do what a C++ program would do in 1 minute, but it takes me 20 minutes to write the Python script and an hour to write the C++, I'm better off doing it in Python, unless it's something I am going to have to run frequently.
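To put rough numbers on "frequently", here is a back-of-the-envelope sketch using the figures above (20 minutes to write plus 20 minutes per run for the script, 60 minutes to write plus 1 minute per run for the compiled version). The function names are made up for illustration, and the model ignores debugging time, maintenance and everything else that matters in practice.

```python
def total_minutes(write_time, run_time, runs):
    """Total cost in minutes: writing once plus running `runs` times."""
    return write_time + run_time * runs


def break_even_runs(script_write, script_run, compiled_write, compiled_run):
    """Smallest number of runs at which the compiled version is strictly cheaper.

    Returns None if the compiled version is not faster per run.
    Assumes whole-minute integer inputs.
    """
    extra_write = compiled_write - script_write  # extra time spent writing
    saved_per_run = script_run - compiled_run    # time recovered on each run
    if saved_per_run <= 0:
        return None
    return max(1, extra_write // saved_per_run + 1)


for runs in (1, 2, 3):
    print(runs, total_minutes(20, 20, runs), total_minutes(60, 1, runs))

print(break_even_runs(20, 20, 60, 1))
```

With these figures a single run costs 40 minutes in Python against 61 in C++, two runs cost 60 against 62, and from the third run onward (80 against 63) the C++ version comes out ahead.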

There are situations where the choice of language is important, but to be honest, if you are in that situation, you probably know and don't need us to tell you.

As for what is popular/the future - python and R are the most common languages in bioinformatics. Nothing else seems to be "the future". Rust seems to be taking over from C++ where performance is critical, but that has always been in a minority of situations.

ADD REPLY
1

Nothing else seems to be "the future".

I wouldn't rule out Julia. Years ago, perl was about all there was, only a few language enthusiasts were working with python. Now python dominates but I can see Julia having the same trajectory and Julia enthusiasts seem to believe it will eventually replace both python and R (and Matlab).

ADD REPLY
0

5 or 6 years ago I thought: that Julia looks interesting, I wonder if it will go anywhere, it looks like it might combine the best of Python and R. But it seems to have faded from view. It's not just that "it's only a few language enthusiasts", but that it doesn't seem to be increasing.

ADD REPLY
3
3.9 years ago
4galaxy77 2.9k

Coming from a statistical background, I almost entirely use R when I can get away with it. Its capacity for manipulating data into any format is unrivalled as far as I am aware. This is mostly down to the incredible tidyverse libraries, which make doing all kinds of complex things extremely clean and easy. I also enjoy using Rcpp, which makes it very easy to write snippets of high-performance C++ code into your R code.

Some coding snobs/purists will tell you that R isn't a real programming language etc. Whether or not this is true is mostly irrelevant to my view that R tends to be exceptionally good at most of the things it's designed for: primarily complex data manipulation, statistical analysis and general analysis of e.g. data in some kind of text format.

That said, I cringe when I see people analysing massive genomic datasets in R. It is a terrible memory hog and loading entire vcfs into memory is just a non-starter for me.

When I absolutely have to, I will write something in C from scratch. But fortunately this doesn't happen too frequently any more.

ADD COMMENT
0

The fact that tidyverse isn't in the R standard library (yet?) is a huge hindrance for R, IMO. I have spent years actively avoiding it simply because being forced to install and manage 3rd party libraries for every project is such an excruciatingly painful ordeal. In the end I usually find myself going back to Python's csv library for the exact reason you mention: avoiding loading the entire dataset into memory. The data manipulation with csv is not as elegant as in R, but if you can keep your processing such that you're only ever working on one line at a time, then the resource requirements are almost non-existent regardless of the size of the dataset.
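A minimal sketch of that one-line-at-a-time pattern, assuming a tab-separated table with a header row. The column names and the helper `count_by_column` are hypothetical, and the in-memory string stands in for a large file on disk.

```python
import csv
import io
from collections import Counter


def count_by_column(handle, column, delimiter="\t"):
    """Count rows per distinct value of `column`, one row at a time.

    Only the current row and the running counts are held in memory,
    so memory use grows with the number of distinct values, not with
    the number of rows in the file.
    """
    counts = Counter()
    for row in csv.DictReader(handle, delimiter=delimiter):
        counts[row[column]] += 1
    return counts


# Illustrative stand-in for a large variant table; a real script would
# pass an open file handle: open(path, newline="")
VARIANTS = (
    "chrom\tpos\tref\talt\n"
    "chr1\t100\tA\tG\n"
    "chr1\t250\tC\tT\n"
    "chr2\t75\tG\tA\n"
)

counts = count_by_column(io.StringIO(VARIANTS), "chrom")
print(dict(counts))
```

The same loop shape works for running sums or filters written straight to an output file, as long as no step needs to see all rows at once.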

ADD REPLY
1

tidyverse is bloated and doesn't need to be part of R's standard library at all. The most widely used entity - the pipe operator from the package magrittr, is now being implemented in R as the |> operator.

ADD REPLY
2
3.9 years ago

Which language do you use the most in your work?

java / bash

Which language do you prefer?

java

With regards to programming language, which direction do you see the bioinformatics field going?

Rust

ADD COMMENT
2
3.9 years ago
shelkmike ★ 1.4k

I use Python the most, but I love Perl more. Unfortunately, Perl went out of fashion, so new bioinformatics libraries are usually created for Python, and this is the reason I use it.
Regarding the future direction: I think that bioinformaticians will continue to use Python. It is one of the most popular languages in the world and its popularity continues to increase as of February 2021.

ADD COMMENT
2
3.9 years ago
the_dummy ▴ 40

It is not set in stone, but academia is moving to Python in many fields.

The main reasons for this shift are: 1) Python is easy to learn and use for a person who has no experience in programming. 2) It is easy to get and use libraries created by other users, which makes it easy for the community to share knowledge.

Also, I would add R for statistical purposes.

And lastly, bash is an absolute must in my honest opinion.

edit: academy -> academia

ADD COMMENT
0

Which "academy" are you referring to?

ADD REPLY
0

In general. From what I have seen, the non-computer sciences (I'm not sure if this is a term, but I mean fields like biology, chemistry, physics and statistics, where of course R is the main language; I have also seen some of it in the social sciences) adopt Python if they need programming at some point.

Plus, machine learning applications for academic purposes very often use Python libraries.

ADD REPLY
1
3.9 years ago
JC 13k

Which language do you use the most in your work?

Bash, Python, Perl, R (in that order)

Which language do you prefer?

Perl

With regards to programming language, which direction do you see the bioinformatics field going?

Rust, maybe Go

ADD COMMENT
1
3.9 years ago
Sej Modha 5.3k

Python3+ and Bash, but mostly Python3+ because of the amazing libraries/packages implemented and available for bioinformatics analyses. :)

ADD COMMENT
1
3.9 years ago
5heikki 11k

Bash, awk, R and Python

ADD COMMENT
