Hello,
This question may seem rather straightforward. I want to use several datasets from NCBI GEO DataSets, and I have realized that many of them were deposited (and updated) years ago but still have no citation attached.
As such:
1- Can I use such a dataset and simply refer to its NCBI GEO Series ID in a future publication?
and generally,
2- How much would you trust the dataset(s) if there is no publication or set of results to read and cross-reference against? It seems a bit odd to find datasets that were deposited in 2010, for example, but for which no publication is apparent 3 years later. Or is it simply that the database is not updated as frequently as it should be, and I probably need to track down the authors for the citations?
Thoughts and advice from the bioinformatics community would be appreciated.
Thank you!
Deena
As an editorial comment, unpublished data are just as expensive to produce as published data, so making them available is valuable. We, as a community, should thank our colleagues (not all of them are academics; some work in industry) when they provide such data, assuming the quality is good enough to be useful!
It depends on the data itself. If there is no publication, I prefer not to use such data in serious research. If I use data that is not mine or my lab's, and unpublished on top of that, I feel like a chiseler, lol. I also usually read the description of the project in the database. Authors often mention that the data are unpublished and ask to be emailed if someone wants to use their dataset.
I would add that sometimes people don't bother to update the data submission with a citation. I have seen a number of datasets in GEO with no citation, even though the GEO accession of the dataset was given in the publication. So my guess is that if you dig a little into the datasets with no citation, you may sometimes find a corresponding publication.
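One way to do that digging is to search the literature for the accession string itself. Below is a minimal sketch, assuming the Europe PMC REST search endpoint and its JSON layout (`resultList` / `result`) behave as I remember them. The function names are mine, and `GSE12345` is only a placeholder accession, not a real recommendation. Treat any hits as candidates to check by hand, since a paper may mention an accession without being the primary publication.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumed endpoint: Europe PMC REST search service.
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(accession, page_size=25):
    """Build a search URL that looks for the exact accession string."""
    if not re.fullmatch(r"GSE\d+", accession):
        raise ValueError(f"not a GEO Series accession: {accession!r}")
    params = {
        "query": f'"{accession}"',  # quoted so it is matched as a phrase
        "format": "json",
        "pageSize": page_size,
    }
    return EUROPE_PMC_SEARCH + "?" + urllib.parse.urlencode(params)


def extract_hits(payload):
    """Pull (pmid, title, year) tuples out of a decoded JSON response."""
    results = payload.get("resultList", {}).get("result", [])
    return [(r.get("pmid"), r.get("title"), r.get("pubYear")) for r in results]


def find_candidate_publications(accession):
    """Query the service (needs network access) and return candidate papers."""
    with urllib.request.urlopen(build_search_url(accession), timeout=30) as resp:
        payload = json.load(resp)
    return extract_hits(payload)
```

Calling `find_candidate_publications("GSE12345")` with a real accession in place of the placeholder would return a list of tuples to review. An empty list does not prove the data are unpublished, only that this particular search found no mention.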