Question

Sample Identifier Standard For LIMS Systems?

7

14.1 years ago

Roman Valls Guimerà ▴ 620

There's a seemingly trivial question that has been floating around in our lab for quite some time now, and it relates to proper LIMS sample tracking. Namely:

Is there any consensus within the LIMS/bioinformatics community regarding the identification and physical labeling of biological samples?

The requirements may appear obvious but need mentioning:

1) Unique identifiers: We don't want any ID collisions downstream.

2) Physical barcoding (if any) should fit on all tube shapes.

3) Informative labels vs simplicity:

     3.1) Is it advisable or useful to embed metadata in the unique identifier, such as the sample type (RNA/DNA/other)?
     3.2) Or just a straight random but human-readable string?
     3.3) What is the optimal length for those?

4) Easiest input for lab people: administrative burden such as relabeling or misreadings should be kept to a minimum.
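To make options 3.2 and 3.3 concrete, here is a minimal sketch of a random but human-readable identifier generator. The alphabet (digits and uppercase letters without the easily confused 0/O and 1/I/L), the default length of 8, and the function name `new_sample_id` are all illustrative assumptions, not an established standard.

```python
import secrets

# Assumed alphabet: digits and uppercase letters minus characters that are
# easily misread on a small label (0/O, 1/I/L). Illustrative, not a standard.
ALPHABET = "23456789ABCDEFGHJKMNPQRSTUVWXYZ"


def new_sample_id(length=8, existing=None):
    """Return a random ID of `length` characters that is not in `existing`.

    `existing` is a set of IDs already issued; the new ID is added to it,
    so repeated calls with the same set never return a duplicate.
    """
    if existing is None:
        existing = set()
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in existing:
            existing.add(candidate)
            return candidate


issued = set()
ids = [new_sample_id(existing=issued) for _ in range(5)]
print(ids)
```

With 31 symbols and 8 characters there are roughly 8.5 × 10^11 possible IDs. In a real LIMS the collision check would be a unique constraint in the database rather than an in-memory set.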

The alternatives we've been looking at are:

http://www.bradyid.com/bradyid/scpv/Labels,-Markers-and-Tapes~Laboratory-Labels.html

Or, going against the "no ad hoc solutions" mantra (which I fully agree with):

http://biostar.stackexchange.com/questions/4622/how-do-we-discourage-ad-hoc-bioinformatic-analyses

A DIY labeling scheme using standard UPC/EAN barcoding systems:

http://www.barcodesinc.com/generator/index.php
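For the DIY route, the only arithmetic involved in an EAN-13 code is the check digit. The sketch below computes it from the first 12 digits. The function names are made up for illustration, and mapping lab samples onto 12-digit numbers is an assumption: official EAN/UPC prefixes are allocated by GS1, so self-assigned numbers like these are only suitable for internal use.

```python
def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit for a 12-digit payload.

    Digits are weighted 1, 3, 1, 3, ... from left to right; the check digit
    brings the weighted sum up to the next multiple of 10.
    """
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 ASCII digits")
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def to_ean13(first12: str) -> str:
    """Append the check digit to a 12-digit payload."""
    return first12 + str(ean13_check_digit(first12))


print(to_ean13("400638133393"))  # prints 4006381333931
```

A scanner that validates the check digit will reject most single-digit misreads, which helps with requirement 4.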

So what's the chosen system at your facility? Any regrets or suggestions?

pipeline barcode • 6.6k views

ADD COMMENT • link updated 11.5 years ago by marksimon112 • 0 • written 14.1 years ago by Roman Valls Guimerà ▴ 620

score 3 · Answer 1 · 2011-04-14

3

14.1 years ago

Jeremy Leipzig 23k

Barcodes are a standard fixture in most high-throughput lab automation schemes such as those found in large pharmaceutical and agricultural companies.

2D barcodes can store a lot of information, as your avatar suggests you are probably already aware.

I would start by choosing a networked barcode printer that offers the most flexible and open methods of printing (FTP, REST, etc.), and then build your LIMS using identifiers that are human readable, extensible, and scalable.

When I worked in industry we built a small LIMS using the Grails framework, a Groovy-based web application framework like Rails but which is able to leverage Java JAR files more easily. One of those JARs ran the ftp-based barcode printer, so that was a plus. We were also able to integrate mobile devices easily, as both iPod and Android devices now have a range of dedicated barcode reader attachments. I would advise against relying on the built-in cameras in those devices for reading barcodes.

ADD COMMENT • link 14.1 years ago by Jeremy Leipzig 23k

0

Thanks indeed for your answer Jeremy, very interesting :)

What's actually your take on a "human readable, extensible, and scalable" label/string? What did they actually look like on your system?

ADD REPLY • link 14.1 years ago by Roman Valls Guimerà ▴ 620

0

I mean it can be as simple as experiment.condition.sample.individual.run or whatever your workflow entails. The point is to avoid painting yourself into a corner, so that if you need to run a sample twice or divide a sample into two conditions, you are still able to produce a barcode within your system.
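A minimal sketch of how such a dotted identifier could be handled in code, assuming the five fields named above and an integer run counter. The class name `SampleId`, the `rerun` helper, and the example values are hypothetical, not taken from any actual system.

```python
from typing import NamedTuple


class SampleId(NamedTuple):
    """Dotted identifier: experiment.condition.sample.individual.run"""
    experiment: str
    condition: str
    sample: str
    individual: str
    run: int  # assumed to be an integer counter

    @classmethod
    def parse(cls, text: str) -> "SampleId":
        parts = text.split(".")
        if len(parts) != 5:
            raise ValueError(f"expected 5 dot-separated fields, got {len(parts)}")
        *names, run = parts
        return cls(*names, int(run))

    def __str__(self) -> str:
        return ".".join(
            [self.experiment, self.condition, self.sample, self.individual, str(self.run)]
        )

    def rerun(self) -> "SampleId":
        """Return the identifier for the next run of the same sample."""
        return self._replace(run=self.run + 1)


sid = SampleId.parse("EXP12.heat.S003.A7.1")  # made-up example values
print(sid.rerun())  # prints EXP12.heat.S003.A7.2
```

Running a sample a second time only bumps the last field, so the new barcode is still valid within the scheme.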

ADD REPLY • link 14.1 years ago by Jeremy Leipzig 23k

0

In our system I believe it was experiment.condition.plate.flat.run

ADD REPLY • link 14.1 years ago by Jeremy Leipzig 23k

score 1 · Answer 2 · 2011-04-24

1

14.0 years ago

Roman Valls Guimerà ▴ 620

Just found by chance a system used by the NERC BioLinux people (Tim Booth):

http://nebc.nerc.ac.uk/tools/handlebar

Written in Perl, with the latest release back in 2008. It seems quite comprehensive and has good docs:

http://nebc.nerc.ac.uk/handlebar/user_guide.pdf

ADD COMMENT • link 14.0 years ago by Roman Valls Guimerà ▴ 620

score 1 · Answer 3 · 2013-07-27

I recommend the Unified Informatic Identifier (UINI) scheme. I designed it to:

Allow universal tracking of samples and organisms through the wet lab, to the products of in silico analysis
Allow independent teams of scientists to merge and share data with databanks
Provide for identification of relatively large numbers of items
Provide for efficient computation and transcoding by machines
Assure low cost of adoption
Assure low cost of administration
Remain viable as an identifier scheme for a period of decades
Eliminate idiosyncratic human affordances which will soon become immaterial due to automation

The UINI scheme is free. It requires no centralized administration. However, it is not widely used, and would surely benefit from peer review, public ridicule, or whatever.

There is currently no standard barcode format for UINIs. Perhaps you can suggest one, or offer feedback to the community if you adopt UINIs in your LIMS.

The UINI scheme is more suited to computers than humans. I love humans, but they usually dislike manual data entry; they do it slowly and inaccurately. A barcode scanner is a better solution. We have already seen the advent of robotic wet labs and robotic sample libraries. For these reasons, and more, the UINI scheme favors automated processing over manual human processing. If you are in charge of an enterprise, it will probably be cheaper to deploy barcode scanners in your wet labs than to hire legions of computer programmers to cope with quirky, inward-looking identifier schemes. Barcode scanners that emulate computer keyboards are now available and require little in the way of custom software integration. Depending on your specific requirements, barcodes can sometimes be printed on standard printers with free or inexpensive software.

The UINI scheme does NOT encode data within the identifier itself. A UINI is a sequence of numbers and letters. For reasons too numerous to list here, encoding data within identifiers is generally a terrible idea. Encoding data within identifiers is suitable for insular groups of humans who are working manually, so the practice refuses to die and is widely misapplied everywhere else. The UINI scheme avoids this practice, because it would constitute malpractice in the broader contexts to which the UINI scheme pertains.
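As a rough illustration of keeping data out of the identifier, the sketch below issues an opaque ID and stores all metadata in a separate record keyed by it. It uses a random UUID from Python's standard library purely as a stand-in; this is not the UINI format, and the `SampleRecord` fields, the `registry` dictionary, and the example values are hypothetical.

```python
import uuid
from dataclasses import dataclass


@dataclass
class SampleRecord:
    # Hypothetical metadata fields; a real LIMS would have many more.
    material: str       # e.g. "RNA" or "DNA"
    collected_by: str


# Stand-in for a database table keyed by the opaque identifier.
registry = {}


def register_sample(material: str, collected_by: str) -> str:
    """Issue an opaque ID (a random UUID here, NOT a UINI) and store metadata."""
    sample_id = str(uuid.uuid4()).upper()
    registry[sample_id] = SampleRecord(material, collected_by)
    return sample_id


sid = register_sample("RNA", "lab-tech-01")
# A mislabeled material type is corrected in the record; the ID on the tube
# stays valid because it never encoded the material in the first place.
registry[sid].material = "DNA"
print(sid, registry[sid])
```

Had the label encoded "RNA", correcting the record would have meant relabeling the tube or living with a misleading identifier.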

If you operate a LIMS, pipeline, or databank, you can standardize on the UINI scheme to refer to all classes of data. Whatever scheme you adopt, though, there will be issues related to historical data and identifier mapping. These issues are not unique to the UINI scheme. If you've been operating for any length of time, you may already have identifier mapping solutions in place.

You can download the UINI specification as a PDF, online. The first public UINI specification, dated December 2012, was 36 pages, and was identified by 59D25EAD-3C7C-4871-B9B3-76D559F0DC22. If the hyperlink does not work, you may use your computer's software clipboard to cut-and-paste the document UUID into a search engine. The hyperlink takes you to an index. That index should be updated in the future to include revisions to the UINI specification. Use the hyperlink if it works for you. Persons who construct information systems were the intended audience for the specification, so it may or may not be comprehensible.

If you identify a problem with the UINI scheme, please advertise your complaint so that the problem can be remedied or others will know to avoid it.

In a sufficiently narrow and well-regulated context, you may be able to get away with printing only the last several digits of a UINI on your sample, which could be sufficient to distinguish samples within that context. However, I strongly recommend a barcode-based solution that encodes the entire UINI.
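Whether "the last several digits" is enough can be checked mechanically for a given set of identifiers. The sketch below finds the shortest suffix length that still distinguishes every identifier in a batch. The function name and the example strings are made up for illustration and are not real UINIs.

```python
def shortest_unique_suffix_length(identifiers):
    """Return the smallest n such that the last n characters of every
    identifier are distinct within the given collection."""
    ids = list(identifiers)
    if not ids:
        return 0
    if len(set(ids)) != len(ids):
        raise ValueError("identifiers must be distinct")
    longest = max(len(i) for i in ids)
    for n in range(1, longest + 1):
        if len({i[-n:] for i in ids}) == len(ids):
            return n
    # Unreachable for distinct inputs: at n == longest each suffix is the
    # whole identifier.
    raise AssertionError("no distinguishing suffix found")


batch = ["7F3A91C2", "0B44E1C2", "7F3A9D07"]  # made-up identifiers
n = shortest_unique_suffix_length(batch)
print(n, [i[-n:] for i in batch])  # prints 4 ['91C2', 'E1C2', '9D07']
```

The result only holds for the batch that was checked; adding new samples to the same context can invalidate it, which is one reason to prefer a barcode carrying the full identifier.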

Thanks to barcode scanners, hyperlinks, and software clipboards, nobody should have to manually type a UINI, so that issue should be moot.

I wouldn't normally have responded to this two-year-old discussion thread, but this question reappeared in the Biostar feed about seven days ago, so perhaps the question is still relevant. I also don't like to engage in self-promotion, but the UINI scheme is free, and I don't want to see another quirky identifier scheme proliferate. I tried to invent an identifier scheme that will be broadly workable.

Identification of scientific samples in LIMS was a specific concern when I conceived the UINI, because I was interested in being able to trace genetic and proteomic information back to the specific sample or individual animal from which it was obtained. Hopefully the UINI scheme will prove helpful to you or your colleagues.