The US FDA's MicroArray Quality Control (MAQC) consortium's latest study suggests that human error in handling DNA microarray data analysis software could delay the technology's wider adoption in clinical research.
Do you agree with this analysis?
How common are procedural failures in microarray-based research relative to non-bioinformatic research?
More importantly, how do researchers control human error, report errors, and define the risks of errors?
The take-home messages of the paper are sound, and follow what I would have expected:
Some problems are easy (Guess the sex!), while others are hard.
Experienced teams do better than novice teams.
Most standard algorithms are equally good at finding clear signals.
It's very easy to screw up the initial bookkeeping and hose your whole project.
Cross-validate, and don't pick arbitrary "training" and "test" sets (see the sketch after this list).
Almost no one actually does reproducible research. The only way to publish a reproducible result is to hand someone your raw data and a turn-key script that reproduces your classifier.
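To make the cross-validation point concrete, here is a minimal sketch in base R. It assumes a normalized expression matrix `expr` (genes × samples) and a factor `y` of class labels, and uses a simple nearest-centroid rule purely for illustration; it is not the classifier used in the MAQC study.

```r
## Minimal k-fold cross-validation sketch (base R).
## Assumes: expr is a genes x samples matrix, y is a factor of class labels.
## The nearest-centroid classifier below is illustrative only.
set.seed(42)                                          # reproducible fold assignment
k     <- 5
folds <- sample(rep(1:k, length.out = ncol(expr)))    # random, not hand-picked, splits

accuracy <- sapply(1:k, function(i) {
  train <- folds != i
  test  <- folds == i
  ## class centroids computed from training samples only
  centroids <- sapply(levels(y), function(cl)
    rowMeans(expr[, train & y == cl, drop = FALSE]))
  ## assign each test sample to the nearest centroid (Euclidean distance)
  pred <- apply(expr[, test, drop = FALSE], 2, function(s)
    colnames(centroids)[which.min(colSums((centroids - s)^2))])
  mean(pred == y[test])
})
mean(accuracy)   # cross-validated accuracy estimate
```

The non-negotiable parts are that the folds are assigned at random and that everything, centroids included, is learned from the training samples only.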
I think protocol errors (off-by-one errors, etc.) are common in bioinformatic analysis, but my own opinion is that this is true of any high-throughput method in inexperienced hands (Mass Spec, FACS, etc.). There are numerous ways to reduce these protocol errors, and the reference list for that paper has many good suggestions. Anyone can make a mistake, but the more analysis you do, and the more systematic you are about it, the better you get. I think the key is being relentlessly systematic and working so that you can always reproduce the whole analysis. Develop recipes for analytical problems so that the mundane business is routine. Do your work so that you can turn a key and re-do your whole analysis from raw data on command. Assume your dataset has batch effects. Assume that Keith Baggerly will be the next person to look at your methods section.
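One common way to make "turn a key and re-do your whole analysis" concrete is a single driver script that runs every step in order from the raw data. A minimal sketch in R, where every file and script name is a placeholder rather than a prescribed layout:

```r
## run_all.R -- hypothetical turn-key driver: regenerates everything from raw data.
## Script names below are placeholders, not a prescribed project layout.
rm(list = ls())            # start from a clean workspace on every run
set.seed(2010)             # fix any randomised steps (e.g. cross-validation folds)

source("01_load_raw_data.R")   # read raw files, check sample annotations
source("02_normalize.R")       # normalization code stored next to the data it produces
source("03_batch_check.R")     # inspect and, if needed, adjust for batch effects
source("04_classifier.R")      # build and cross-validate the classifier
source("05_figures.R")         # write the figures and tables for the paper

sessionInfo()              # record package versions alongside the results
```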
My own methods (worked out over years of struggling with this issue) are to perform my analytical work in a way that leaves as completely reproducible an artifact as I can generate for me or another observer. In practice, this means that code used to normalize datasets is stored in a standard location next to those data, and all code required to generate publishable results can be extracted from an online lab notebook. I can pull out the R code from my notebook, run it unchanged, and generate Figure 1, Figure 2, etc. I still make mistakes, but they can be traced.
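A per-figure script in that style might look like the following sketch; the paths, the "group" column, and the PCA plot itself are hypothetical stand-ins, the point being that the figure is built only from files on disk and can be re-run unchanged:

```r
## figure1.R -- hypothetical example of a figure script that re-runs unchanged.
## Paths and the "group" column are placeholders; the point is that the figure
## depends only on files stored on disk, not on objects in an interactive session.
norm  <- readRDS("data/normalized_expression.rds")   # normalized genes x samples matrix
pheno <- read.csv("data/sample_annotation.csv")      # one row per sample, in column order

pca <- prcomp(t(norm))                               # PCA over samples

pdf("figures/Figure1.pdf", width = 6, height = 6)
plot(pca$x[, 1:2],
     col  = as.integer(factor(pheno$group)),         # colour samples by annotated group
     pch  = 19, xlab = "PC1", ylab = "PC2",
     main = "Samples after normalization")
dev.off()
```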
@David Quigley: I agree that "this is true of any high-throughput method in inexperienced hands" -- which is the problem: based on my own anecdotal observations, a significant number of scientists do not have a background in high-throughput systems and have grown into the role. Further, based on what I'm seeing so far, it's not the norm to have turn-key systems end-to-end, or controls in place to confirm the effects of any changes made to the system. Are your systems turn-key end-to-end, and do you have controls that test the effects of any changes made to the output of the system?
A related paper analyses five case studies and finds a large number of errors, most of which relate to "simple" tasks such as labelling samples correctly.
As to your questions:
The MAQC study looks like a very good analysis (and much as you'd expect).
Procedural failures are quite common in all research, both computational and lab-based.
The best error control is to have multiple eyes examine your work; unfortunately, many researchers work in near-isolation.
Not sure I'd agree that this will delay clinical adoption of microarray technology. There are already many diagnostic labs that use microarrays and you'd imagine that to be certified, they require stringent QC practices. My (personal, subjective) feeling is that microarrays are on the way out anyway and will be largely replaced by deep sequencing methods in the next 5-10 years.
I'm looking forward to a sequencing-based world. However, deep sequencing methods will still be susceptible to all of the fundamentally hard parts of gene expression analysis. There's a reasonable argument to be made that while more informative, sequencing will also provide many new opportunities for error and unintentional bias.
Hmm. So much time spent removing batch effects from experiments. So many times trying to work out which samples have been mislabelled by the company that ran the arrays... (see the sketch at the end of this answer for the kind of batch adjustment I mean).
I don't think it's going to delay adoption much though. I understand that the clinical environment is somewhat different to the research environment, but weight of evidence wins out in the end. I endorse Neil's answer wholeheartedly with regards to the 'end of days' for arrays.
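On the batch-effect point above, here is a hedged sketch of the kind of adjustment meant, for exploration and plotting only. It assumes the Bioconductor package limma is installed, a normalized genes × samples matrix `expr`, and a known `batch` factor; this is not necessarily this poster's actual workflow, and for formal testing you would include batch in the model design rather than subtracting it first.

```r
## Adjust for a known batch variable before clustering/plotting, using limma.
## Assumes: expr is a normalized genes x samples matrix, batch is a factor of
## batch labels. removeBatchEffect is intended for visualization, not for
## downstream differential testing.
library(limma)

adjusted <- removeBatchEffect(expr, batch = batch)

## Compare sample clustering before and after adjustment:
par(mfrow = c(1, 2))
plotMDS(expr,     labels = batch, main = "Before")
plotMDS(adjusted, labels = batch, main = "After")
```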