Forum: Reuse of publicly available datasets
7.8 years ago
sanathoi ▴ 10

Hi, I am a novice in RNA-Seq and am currently doing some RNA-Seq analysis on publicly available reads. I was wondering whether it is possible to publish work based on those publicly available reads.

publication RNA-Seq sequencing • 2.4k views

If it's public, then it's public: add a citation to give credit where it's due, and that's it. Entire meta-analysis approaches and benchmarking papers make a living from this.


This post can use a different title.

Reproductivity of publicly available reads

Perhaps "Fair-use of publicly available reads" or "Re- or Meta-analysis of publicly available reads" would be more suitable?

7.8 years ago
GenoMax 147k

Publicly available data sometimes still carries restrictions. These may be presented in a "click through" agreement that you agree to (don't we all) as you breeze by. This is important for data that may still be unpublished or under analysis. A common restriction is that you can't use the data for an independent publication (especially if it is still unpublished).

In short, you would want to do due diligence before assuming an accessible dataset is public and running with it.


Interesting point. Say I download all assemblies from the Trace archive in search of a single gene, or download raw data from SRA and make my own assembly?

There are also several documents on open data policies:

e.g. the Fort Lauderdale meeting discussing community resource projects, the resulting NHGRI policy statement, and the Toronto statement.


download raw data from SRA and make my own assembly?

That may run afoul of this tenet from the Toronto statement:

Respecting the scientific etiquette that allows data producers to publish the first global analyses of their data set

I think there are two situations where I could still use it, though I am not sure about the first:

  1. If the data has not been published yet, but I am looking only for 1 gene (not a global analysis)
  2. If the data has been published, I am free to use it for whatever I like.

Case #1 could be considered "fair-use" as long as you indicate the exact search you did on the "trace archive" so it could be reproduced.


I see another problem here, and I hope it is not a real one... If I search the whole Trace archive and only use assemblies with citations, how would I fit 1000+ citations into that article?


Maybe like we do in population genetics. It is hard to cite >300 papers, so we just put the citations in the supplement and cite them briefly alongside one table with the data. Another solution is to cite only the database; so far no one has called citing just UCSC wrongdoing.

7.8 years ago
Emily 24k

If in doubt, contact the owners of the data.

7.8 years ago
Benn 8.3k

I guess it is. Of course, explain in the methods section of your paper how you got the data and what you did with it.


Indeed, you'll have to cite and acknowledge your data sources, and check whether there is a license agreement that you have to follow.


Yeah, that last part is quite important. I think it would be hard to lay hands on a licensed data set just like that, but nevertheless make sure you fulfill the agreement. Often a fully public data set is simply released, and you can use it as it is for a scientific publication. Very often, though, those data sets have parts that are not fully public, and you need permission to obtain and use them as intended.


Thanks, everyone, for clearing up my doubts. Indeed, citation and acknowledgement are most important in scientific publication.
