In the past, I've been stung by some pretty bad software that was doing silly things that no one in their right mind would have thought would be OK. Unfortunately, the "result" of this tool looked valid (or rather, there was nothing to compare it to), and so until I read the source code, I didn't actually know anything was wrong.
Since that day, I've been a pretty stubborn Bioinformatician. Except for the very complicated and public tools like the mappers or samtools, I don't run software unless I can read the code. Unfortunately, the only bioinformatics-relevant programming language I know is Python, so that somewhat limits how much code I can realistically use. The one big exception is Picard, which is all Java, simply because I trust its authors. So yes, I am also a hypocrite. I would also consider being able to run awk and other unix tools in a pipe to be programming in a wider sense - however, it's a bit of a grey area since you don't really know what sort or uniq are doing (e.g. uniq on OSX only looking at the first 8kb of data per line).
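For what it's worth, the `sort | uniq -c` step is one of the easier ones to replace with code you can read. Below is a minimal sketch in Python, assuming plain newline-terminated text input. The function name `count_unique_lines` and the 9000-character example lines are made up for illustration, and this is not a claim about how any particular uniq implementation behaves.

```python
import io
from collections import Counter


def count_unique_lines(handle):
    """Count identical lines, comparing every line in full.

    Roughly what `sort | uniq -c` is meant to do, but with the
    comparison spelled out: two lines are the same only if every
    character matches, however long the line is.
    """
    counts = Counter()
    for line in handle:
        # Strip only the trailing newline so whitespace differences still count.
        counts[line.rstrip("\n")] += 1
    return counts


# Hypothetical input: two very long lines that differ only in the last character.
line_a = "A" * 9000 + "x"
line_b = "A" * 9000 + "y"
example = io.StringIO("\n".join([line_a, line_b, line_a]) + "\n")
example_counts = count_unique_lines(example)
```

Here `example_counts` keeps `line_a` and `line_b` as separate keys, because the whole line is the dictionary key. The trade-off is memory: every distinct line is held in RAM, which a sort-based pipe avoids.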
So that's how I behave in practice. However, when people ask me "do you need to be a programmer to do Bioinformatics?" I say "no way - writing tools is not something you should have to do!".
Recently, however, I've been doing a lot of mapping - so I've had some time on my hands to really think about this. The result is a few rhetorical questions, and I would be interested to hear people's thoughts :)
- If you publish findings using a tool that gave incorrect results, is that your fault? How much of the blame would you accept?
- If you ask (or have been asked) to use a new tool for a certain analysis, will you (or your boss) set aside time in the plan for reading the code?
- Do you ever use closed-source tools in any step of your workflow? (ones that have an impact on the results, not, say, Oracle Database or a text editor)
- Do you think you need to be a programmer to be a Bioinformatician?
The last question is not "to get results" but to truly be a Bioinformatician. I am frequently told by my boss that I'm not a bioinformatician, I'm a PhD student, so here when I use the term Bioinformatician I really do mean someone who knows what they're doing. Not someone who can simply get some sort of result, but someone who gets the best result, and knows why it's the best.
Thank you so much for your time :)
@John: I will add this as a comment since I don't think it qualifies as a full answer.
First of all, bioinformatics tools generate hypotheses. In most cases a researcher should (needs to) independently verify those hypotheses (preferably by experimental means). So even if one were to get incorrect results from an informatics tool, the hope is that the error would be caught during independent verification (if Murphy's law applies then all bets are off, but then you may not be the first person to have had that happen).
There needs to be an "applied" bioinformatician classification that the majority of us probably fall into. We are not programmers and can't write standalone packages, but at the same time we have enough knowledge to tweak output as needed and to make sense of how to use code that gets put out. Ultimately, the domain experts you are working with need to make sense of, and check the validity of, any results you (or the software being used) produce.
You make a very good point about secondary validation nullifying any systematic errors in the bioinformatic steps. In practice I wish this was done more often - but perhaps it's not as easy as it sounds. If there is a totally different biological assay (which requires totally different tools to analyse) to get the same information, then yes. But without that secondary assay, your options would be to use different software that aims to do the same thing - which may not be compatible by design. For example, CuffDiff for some samples, DESeq for others. Now they cluster by analysis software, not biology - so which one is right? Without knowledge of how the program works, it's difficult to weigh up their pros and cons. But at least you now know there are differences! :)
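To at least put a number on "there are differences", one option is to compare the lists of significant genes that two tools report for the same samples. This is a minimal sketch in Python, assuming you have already pulled the significant gene IDs out of each tool's output yourself. The function name `concordance` and the gene IDs are hypothetical, and nothing here parses real CuffDiff or DESeq files.

```python
def concordance(calls_a, calls_b):
    """Compare two collections of significant gene IDs.

    Returns the shared genes, the genes unique to each tool, and the
    Jaccard index (size of intersection / size of union).
    """
    set_a, set_b = set(calls_a), set(calls_b)
    shared = set_a & set_b
    union = set_a | set_b
    # Two empty lists agree trivially; avoid dividing by zero.
    jaccard = len(shared) / len(union) if union else 1.0
    return {
        "shared": shared,
        "only_a": set_a - set_b,
        "only_b": set_b - set_a,
        "jaccard": jaccard,
    }


# Hypothetical gene IDs, purely for illustration.
tool_a_hits = ["geneA", "geneB", "geneC"]
tool_b_hits = ["geneB", "geneC", "geneD"]
summary = concordance(tool_a_hits, tool_b_hits)
```

With these made-up lists, two of four distinct genes are shared, so the Jaccard index is 0.5. A low value only tells you the tools disagree, not which one is right; for that you still need to know how each program works, or a secondary assay.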