This is a debate I had the other day.
If a budding bioinformatician had to choose to study only one of these topics:
Database design and development vs. algorithms and data structures.
Which would you recommend?
A budding bioinformatician would never have to choose between those topics. They are both quite fundamental to the subject.
To my mind, a bioinformatician is someone who sees the world (and that includes their own education) as data and is able to absorb and process information from many different areas.
EDIT: my clumsy second sentence is expressed, much better, by Sean Eddy in “Antedisciplinary” Science. I encourage all budding bioinformaticians to read it.
I have seen bad database designs, bad indexing and bad queries, all because people do not really understand how a database works. To get the best out of databases, we must understand data structures first. Once we understand algorithms and data structures, learning to use or even design a database is trivial (I spent little time on learning SQL, but I believe I am above the average level).
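To make the indexing point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `genes` table, its columns, and the index name `idx_genes_symbol` are made up for illustration. It asks SQLite how it plans to run the same lookup before and after an index exists; an index is essentially a search tree kept alongside the table, which is why knowing data structures helps here.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one string (the 'detail' column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

# Hypothetical toy table; names and contents are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genes (id INTEGER PRIMARY KEY, symbol TEXT, chrom TEXT)")
conn.executemany(
    "INSERT INTO genes (symbol, chrom) VALUES (?, ?)",
    ((f"GENE{i}", f"chr{i % 22 + 1}") for i in range(10000)),
)

lookup = "SELECT chrom FROM genes WHERE symbol = ?"

# Without an index, SQLite has to scan every row of the table.
plan_before = query_plan(conn, lookup, ("GENE4242",))

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_genes_symbol ON genes (symbol)")
plan_after = query_plan(conn, lookup, ("GENE4242",))

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, but the first plan should report a scan of the table and the second should mention the index.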
In addition, databases are fine for fewer than a million records. To make one work efficiently for millions of records, you need to be fairly good, which means really understanding how databases work. For billions of records, specialized formats and algorithms can be far more efficient than databases; at that scale you cannot design an efficient database even if you are the best.
EDIT: By "databases" here, I mean the traditional SQL databases.
I certainly agree with the point about fewer than a million records. Although there are many high-end database applications around the world that deal with much larger datasets (e.g. Google), the resources necessary to make these ultra-large database systems efficient are so big that it doesn't make sense to use them in scientific research.
In my opinion everything in bioinformatics is about speed (or you trade it for space). Databases are fairly optimized and well established, so even unskilled people achieve fast queries. Just remember the basic guidelines such as atomization, etc. In contrast, if you choose the wrong data structure(s) for processing, you won't be able to handle your data.
Thus: +1 for algorithms and data structures
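As a small illustration of how much the choice of data structure matters, here is a sketch that counts how many k-mers of a read occur in a reference. The sequences, `k`, and the function names are invented for the example. Both versions give the same answer, but the list version scans the whole collection for every lookup while the set version hashes each k-mer, which is the difference between unusable and fine once the reference gets large.

```python
def kmers(seq, k):
    """Yield every overlapping k-mer of a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def count_hits(read, reference_kmers, k):
    """Count k-mers of the read that occur in the reference collection."""
    return sum(1 for kmer in kmers(read, k) if kmer in reference_kmers)

# Toy data, made up for this sketch.
reference = "ACGTACGTTAGCCGATAGCTAGGCTA"
read = "CGTTAGCCGATTT"
k = 5

as_list = list(kmers(reference, k))  # "in" scans the list: O(n) per lookup
as_set = set(as_list)                # "in" hashes the k-mer: O(1) on average

hits_list = count_hits(read, as_list, k)
hits_set = count_hits(read, as_set, k)
print(hits_list, hits_set)
```

On a toy input like this you will not notice any difference; with millions of reference k-mers the list version becomes the bottleneck.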
I think any answer that would choose one over the other would be fundamentally wrong. You need algorithms to work on the data stored in databases. Or your algorithms will help create the data to be stored in these databases. It is very rare that an algorithm works on a data item and the output is not stored in some form of data structure that is stored, accessed by others, etc. Image processing might be the closest example of a case where databases are not as important, but even then most modern systems store their images in databases. In the end, databases and algorithms are tools; what is more important is being able to think scientifically and to generate ideas and hypotheses.
That's exactly my point. You should advise the student to work on both, but if you want an absolute answer, tell him to work on his math, statistics and biology, and on becoming an expert in at least two programming languages, one being Perl/Python and the other C/Java. Everyone has their own favorite language, and which one you choose is for the most part simply a matter of personal preference.
You don't need to know how a database is designed to use it; you only need to know what the annotated data means and how to get it. So learning how to find, judge and use a database is more important than learning how to design one.
Algorithms and data structures are more useful for a bioinformatician, in the sense that you can analyze a set of sequences or a dataset without using information from other databases. However, you can be a good programmer and bioinformatician without having a deep understanding of how data structures are implemented: just knowing the difference between an array and a list is sufficient, and you can learn that in a few days.
I am very much of the opinion that you shouldn't close yourself off into predefined artificial boxes! Bioinformatics is a multi-disciplinary field that draws on the strengths of various sub-disciplines within computer science (as well as biology and chemistry).
I understand that you are trying to ascertain what people believe to be the most important, but that is completely subjective. The truth is that these topics are like cogs in a working machine. They all need to interact smoothly or it all falls apart, regardless of your opinion of the topic.
A great bioinformatician is one that knows their tools and their "machine", intimately.
There is already good analysis here, but I want to point out something too. As bioinformaticians, our main goal is to understand the biodata, which can be in any form. The first step is focused on the biodata and on analysis of the results, which means algorithms and data structures carry a lot of weight. We don't need to worry about a database and its schema unless we are going to design our own. But algorithms are applied to many different forms of data, so the structure of the data is always relevant.
So it depends on what level you are at and what you are going to do in your field. But, in conclusion, it's always better to understand algorithms and data structures before jumping directly to databases.
I would recommend studying the field you like the most. Both are interesting and useful.
I didn't like studying databases. So I know how they work, but I don't like to create a database, work with it, etc. I also know that data and their storage are the basis of bioinformatic analysis. But if someone else can manage it, I am happy.
I like algorithms, so I study algorithms, even if I see more jobs for databases and web services.
Moreover, if you choose one, you will come across the other during your studies, so it does not matter.
Short answer: for a "budding bioinformatician" both are equally important.
I don't agree. For "bioinformatics" both are equally important, but for a "bioinformatician" they are not. You can choose which one to focus on.
It's like "What is most important for a bioinformatician to study: biology or computer science?" Everybody will answer both, but in fact you have either a biology background or a computer science background. And so you have preferences.
This does not mean one is worse than the other, you just choose on which you focus.
Yes, I agree with that. But why did I choose bioinformatics? Because I chose computer science and then I discovered bioinformatics.
I think the same goes for fields of study. You begin with something you like and it guides you to another field. So, finally (and after your answer), I think it does not matter which one you choose. You will obviously come across both databases and algorithms if you are a bioinformatician.
The question is not one or the other. Both are important and you need to be exposed to both. At some point you can make a choice about which aspect you prefer. Both have software design challenges, both have biological challenges. The way I see it, one is infrastructure (databases) and the other is something that leverages that infrastructure to develop and evaluate hypotheses. They work together, and without the other neither is of much use. Should you be good at both? Not necessarily, not everyone can be. Should you be really good at one and have an understanding of the other? Absolutely. Study both and find the one that gets you more excited.
I say this coming from the algo side of life, but I live in awe of people who can build data platforms.
I like the answer, but it's at a higher level of abstraction, and not something I could really have used back in college. I can appreciate it now, having been out of college for a couple of years.
+1 for pretty much exactly what I was thinking! ;)
@Pratek: You could use it in college: you have to study both, only when you know what both topics are about can you decide which aspect is relevant to the science you are doing at the moment.