How Important Is The Long-Term Sustainability Of Data Resources?
3
8
14.1 years ago
Blunders ★ 1.1k

The source of this question is the following quote: "A recent survey found that of more than 530 -omics databases in 33 European countries, more than 67 databases, or 12 percent, were not live or had not been updated since 2005, and the update status was unclear for another 78 databases, or nearly 15 percent of the total."

@TreyLathe: The article's subject is Data Management. Accessing the article requires a premium subscription, but if you click the Google search I've linked to and load Google's cache of each of the article's two pages, you can view it for free:

Bioinformatics Pros: Labs Should Include Data Management in All Project Plans, August 14, 2009, By Vivien Marx

database • 3.0k views
ADD COMMENT
12
14.1 years ago
Neilfws 49k

This issue was first raised, to my knowledge, in a 2004 article entitled "404 not found: the stability and persistence of URLs published in MEDLINE". See also articles citing that work for more recent commentary.

It is important, of course, but we are not making progress. I think there are two main issues. First, long-term support requires long-term funding. Second, most online research resources are set up by academic research groups. Typically, this means that the database/web server runs on a machine in the corner of a lab and is the responsibility of the designated "computer guy". Once that person leaves the group, it's only a matter of time before the resource is lost. Another problem is that academic researchers care only about the publication (in, for example, the annual web server or database issue of Nucleic Acids Research); once that is achieved, maintenance is secondary.

I don't know how best to address these problems. Most journals stipulate that to be published, a resource should be available one year after publication. Perhaps they should actually enforce this by keeping a URL database, checking after one year and withdrawing the publication record if the resource has gone? Just a thought.

ADD COMMENT
0

@neilfws: +1 Thanks for posting! Re: "Most journals stipulate that to be published, a resource should be available one year after publication." -- Seems like the best way to enforce it would be to require that the resource(s) be on an independent hosting service, AND that users pay to download the resource. The other option would be to require authors to put the resource in the public domain, which would lead to copies spreading around based on the demand for the resource. Again, thanks for posting the links and your analysis on the subject. Selecting your post as the answer.

ADD REPLY
0

I agree with all this. You've hit the nail on the head. In our own work we've found similar examples: great resources that lose funding, or whose "computer guy" leaves. This happens to small resources, but larger ones run into this issue too. TAIR is an example of the latter. Not sure of the exact solution, but there will need to be one soon.

@blunders where is that quote from? Would like to read the paper!

ADD REPLY
0

@TreyLathe: Article's subject is Data Management. To access the article requires a premium subscription, but... if you click this Google search I've linked to, and load Google's cache of each of the two pages of the article you can view it for free: Bioinformatics Pros: Labs Should Include Data Management in All Project Plans, August 14, 2009, By Vivien Marx google.com/#sclient=psy&q=inurl:bioinformatics-pros-labs-should-include-data-management-all-project-plans

ADD REPLY
0

NAR at least gives you the option to publish an update to your database every two years, which is an incentive to keep developing it. Still, most resources are probably one-off projects by someone and not maintained... :-/

ADD REPLY
2
13.8 years ago

Of course, all this would be a non-issue if the data were Open from the start. Then every user of the data would duplicate it, keeping the data safe while the databases move or disappear.

ADD COMMENT
0

@Egon Willighagen: +1 Agree, thanks for sharing!

ADD REPLY
1
13.8 years ago
Michael 55k

Just saw this, and I agree with Neil. It is important, very important. So important, actually, that the European Commission has set up the ELIXIR project. So I think the progress in this direction is that some awareness of this problem has developed.

ADD COMMENT
0

@Michael Dondrup: +1 Thanks for sharing!

ADD REPLY
