What are the most important public datasets in biology?
8.6 years ago

I have been asked to compile a list of the top 10 most valuable public biological datasets for potential inclusion in a data commons-like environment. The goal is to have not just large datasets, but highly valuable ones. I am interested in genomics, of course, but also in clinical datasets, well-annotated registries, etc. Any suggestions are welcome and appreciated!

public data crowdsourcing • 2.1k views
8.6 years ago
SES 8.6k

A strong case could be made that the HeLa cell line and its sequence data are the most important dataset in biomedical research (there is a working group page with some background). An article written by Francis Collins a few years ago estimated that over 74,000 publications have referenced this data.

The Global Ocean Sampling Expedition data (and publication) was the first large-scale application of NGS, particularly for environmental sequencing, and the study contributed a great deal to what we know about marine biodiversity.


Thanks. These contributions are exactly what I had in mind.
