The US FDA's MicroArray Quality Control (MAQC) consortium's latest study suggests that human error in handling DNA microarray data analysis software could delay the technology's wider adoption in clinical research.
Do you agree with this analysis?
How common are procedural failures in microarray-based research relative to non-bioinformatic research?
More importantly, how do researchers control human error, report errors, and define the risks of errors?
The take-home messages of the paper are sound, and follow what I would have expected:
Some problems are easy (Guess the sex!), while others are hard.
Experienced teams do better than novice teams.
Most standard algorithms are equally good at finding clear signals.
It's very easy to screw up the initial bookkeeping and hose your whole project.
Cross-validate, and don't pick arbitrary "training" and "test" sets (see the sketch after this list).
Almost no one actually does reproducible research. The only way to publish a reproducible result is to hand someone your raw data and a turn-key script that reproduces your classifier.
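To make the cross-validation point concrete, here is a minimal sketch in base R. It assumes a normalized expression matrix `expr` (genes × samples) and a factor `y` of class labels, and uses a simple nearest-centroid rule purely for illustration; it is not the classifier used in the MAQC study.

```r
## Minimal k-fold cross-validation sketch (base R).
## Assumes: expr is a genes x samples matrix, y is a factor of class labels.
## The nearest-centroid classifier below is illustrative only.
set.seed(42)                                          # reproducible fold assignment
k     <- 5
folds <- sample(rep(1:k, length.out = ncol(expr)))    # random, not hand-picked, splits

accuracy <- sapply(1:k, function(i) {
  train <- folds != i
  test  <- folds == i
  ## class centroids computed from training samples only
  centroids <- sapply(levels(y), function(cl)
    rowMeans(expr[, train & y == cl, drop = FALSE]))
  ## assign each test sample to the nearest centroid (Euclidean distance)
  pred <- apply(expr[, test, drop = FALSE], 2, function(s)
    colnames(centroids)[which.min(colSums((centroids - s)^2))])
  mean(pred == y[test])
})
mean(accuracy)   # cross-validated accuracy estimate
```

The non-negotiable parts are that the folds are assigned at random and that everything, centroids included, is learned from the training samples only.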
I think protocol errors (off-by-one errors, etc.) are common in bioinformatic analysis, but my own opinion is that this is true of any high-throughput method in inexperienced hands (Mass Spec, FACS, etc.). There are numerous ways to reduce these protocol errors, and the reference list for that paper has many good suggestions. Anyone can make a mistake, but the more analysis you do, and the more systematic you are about it, the better you get. I think the key is being relentlessly systematic and working so that you can always reproduce the whole analysis. Develop recipes for analytical problems so that the mundane business is routine. Do your work so that you can turn a key and re-do your whole analysis from raw data on command. Assume your dataset has batch effects. Assume that Keith Baggerly will be the next person to look at your methods section.
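One common way to make "turn a key and re-do your whole analysis" concrete is a single driver script that runs every step in order from the raw data. A minimal sketch in R, where every file and script name is a placeholder rather than a prescribed layout:

```r
## run_all.R -- hypothetical turn-key driver: regenerates everything from raw data.
## Script names below are placeholders, not a prescribed project layout.
rm(list = ls())            # start from a clean workspace on every run
set.seed(2010)             # fix any randomised steps (e.g. cross-validation folds)

source("01_load_raw_data.R")   # read raw files, check sample annotations
source("02_normalize.R")       # normalization code stored next to the data it produces
source("03_batch_check.R")     # inspect and, if needed, adjust for batch effects
source("04_classifier.R")      # build and cross-validate the classifier
source("05_figures.R")         # write the figures and tables for the paper

sessionInfo()              # record package versions alongside the results
```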
My own methods (worked out over years of struggling with this issue) are to perform my analytical work in a way that leaves as completely reproducible an artifact as I can generate for me or another observer. In practice, this means that code used to normalize datasets is stored in a standard location next to those data, and all code required to generate publishable results can be extracted from an online lab notebook. I can pull out the R code from my notebook, run it unchanged, and generate Figure 1, Figure 2, etc. I still make mistakes, but they can be traced.
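A per-figure script in that style might look like the following sketch; the paths, the "group" column, and the PCA plot itself are hypothetical stand-ins, the point being that the figure is built only from files on disk and can be re-run unchanged:

```r
## figure1.R -- hypothetical example of a figure script that re-runs unchanged.
## Paths and the "group" column are placeholders; the point is that the figure
## depends only on files stored on disk, not on objects in an interactive session.
norm  <- readRDS("data/normalized_expression.rds")   # normalized genes x samples matrix
pheno <- read.csv("data/sample_annotation.csv")      # one row per sample, in column order

pca <- prcomp(t(norm))                               # PCA over samples

pdf("figures/Figure1.pdf", width = 6, height = 6)
plot(pca$x[, 1:2],
     col  = as.integer(factor(pheno$group)),         # colour samples by annotated group
     pch  = 19, xlab = "PC1", ylab = "PC2",
     main = "Samples after normalization")
dev.off()
```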
@David Quigley: I agree that "this is true of any high-throughput method in inexperienced hands" -- which is the problem: based on my own anecdotal observations, a significant number of scientists do not have a background in high-throughput systems and have grown into the role. Further, based on what I'm seeing so far, it's not the norm to have turn-key systems end-to-end, or controls in place to confirm the effects of any changes made to the system. Are your systems turn-key end-to-end, and do you have controls that test the effects of any changes made to the output of the system?
A related paper analyses five case studies and finds a large number of errors, most of which relate to "simple" tasks such as labelling samples correctly.
As to your questions:
The MAQC study looks like a very good analysis (and much as you'd expect).
Procedural failures are quite common in all research, both computational and lab-based.
The best error control is to have multiple eyes examine your work; unfortunately, many researchers work in near-isolation.
Not sure I'd agree that this will delay clinical adoption of microarray technology. There are already many diagnostic labs that use microarrays and you'd imagine that to be certified, they require stringent QC practices. My (personal, subjective) feeling is that microarrays are on the way out anyway and will be largely replaced by deep sequencing methods in the next 5-10 years.
I'm looking forward to a sequencing-based world. However, deep sequencing methods will still be susceptible to all of the fundamentally hard parts of gene expression analysis. There's a reasonable argument to be made that while more informative, sequencing will also provide many new opportunities for error and unintentional bias.
Hmm. So much time spent removing batch effects from experiments. So many times trying to work out which samples have been mislabelled by the company that ran the arrays... (see the sketch at the end of this answer for the kind of batch adjustment I mean).
I don't think it's going to delay adoption much though. I understand that the clinical environment is somewhat different to the research environment, but weight of evidence wins out in the end. I endorse Neil's answer wholeheartedly with regards to the 'end of days' for arrays.
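On the batch-effect point above, here is a hedged sketch of the kind of adjustment meant, for exploration and plotting only. It assumes the Bioconductor package limma is installed, a normalized genes × samples matrix `expr`, and a known `batch` factor; this is not necessarily this poster's actual workflow, and for formal testing you would include batch in the model design rather than subtracting it first.

```r
## Adjust for a known batch variable before clustering/plotting, using limma.
## Assumes: expr is a normalized genes x samples matrix, batch is a factor of
## batch labels. removeBatchEffect is intended for visualization, not for
## downstream differential testing.
library(limma)

adjusted <- removeBatchEffect(expr, batch = batch)

## Compare sample clustering before and after adjustment:
par(mfrow = c(1, 2))
plotMDS(expr,     labels = batch, main = "Before")
plotMDS(adjusted, labels = batch, main = "After")
```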