Question

What Features From Publishers Would Help Bioinformatics Folks?

12

13.4 years ago

Mary 11k

So I was looking into the new browser/viewer that's been integrated into Elsevier's papers. There was some discussion of it on our blog yesterday. But it's got me thinking about what would be really useful for taking data in a paper further, more easily and fluidly.

What tools and features would you want in an app from publishers that would make the lives of bioinformatics folks easier?

For me, better text mining tool integration would be great. I can't even believe how bad the search is just for authors sometimes. I like the idea of gene lists too, which could be quickly obtained in one fell swoop and used with other tools. It also stuns me that "big data" sets are not always easily linked to where they can be found in a live browser.

EDIT WEDNESDAY Aug 3: Just saw this tweet--a way to get useful stuff added, might be of interest to this community: RT @geneticsblog: . @PLoS and @mendeleycom Call for Apps: http://bit.ly/oc2NGL and http://bit.ly/nHYqNa

text literature • 4.7k views

ADD COMMENT • link updated 13.4 years ago by Alastair Kerr 5.3k • written 13.4 years ago by Mary 11k

3

What I want from publishers is a complete overhaul of the entire academic publishing system, starting with the abolition of the traditional journal article and ending with something that resembles Github. Unrealistic, me? ;)

ADD REPLY • link 13.4 years ago by Neilfws 49k

0

I agree, I had the same problem finding the actual genome viewer. At some point they say it uses NCBI's genome browser.

ADD REPLY • link 13.4 years ago by brentp 24k

0

On our blog there was some discussion of why it may not be visible, and people gave me screenshots. But I'm talking to Elsevier about getting access. I assumed it would be there on the demo ones even if I am not a subscriber.

ADD REPLY • link 13.4 years ago by Mary 11k

0

Giggle. Good luck with that @neilfws. But I know you aren't alone.

ADD REPLY • link 13.4 years ago by Mary 11k

Ram · Answer 1 · 2011-08-02

9

13.4 years ago

Jeremy Leipzig 22k

The lack of a reproducible research standard is the #1 problem in scientific publishing.

If data and code were married to results using Sweave or other tools, then stuff like the Anil Potti incident wouldn't have required a full-scale covert investigation to uncover:

http://www.nytimes.com/2011/07/08/health/research/08genes.html

ADD COMMENT • link 13.4 years ago by Jeremy Leipzig 22k

1

Great point. It also makes me crazy that I can't get the version number of the software I'm running in certain workflow tools. I keep asking for it anyway.

ADD REPLY • link 13.4 years ago by Mary 11k

1

One of those answers I wish I could +2.

ADD REPLY • link 13.4 years ago by Neilfws 49k

1

I disagree. The #1 problem in scientific publishing is the lack of universal open access; the #2 problem is that research data (reproducible or otherwise) is delivered in an unstructured format. Sweave, while kewl, is a technological red herring and not the solution to the major problems in scientific publishing.

ADD REPLY • link 13.4 years ago by Casey Bergman 18k

0

One thing I started doing so that other people can reuse my data and code more easily: Makefiles that reproduce the results from raw data just by calling the default target, sometimes combined with LaTeX report generation that uses images generated by the scripts.
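
The rule that makes this workflow dependable is simple: a result is rebuilt whenever it is missing or older than the raw data it depends on. Below is a minimal Python sketch of that make-style rule, not the poster's actual Makefile. The names `is_stale`, `build`, and `summarize` are hypothetical, and the "analysis" is a stand-in that just sums the numbers in a raw file.

```python
from pathlib import Path
from typing import Callable, Sequence


def is_stale(target: Path, sources: Sequence[Path]) -> bool:
    """Make's rule: rebuild if the target is missing or older than any source."""
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(src.stat().st_mtime > target_mtime for src in sources)


def build(
    target: Path,
    sources: Sequence[Path],
    recipe: Callable[[Sequence[Path], Path], None],
) -> bool:
    """Run the recipe only when the target is stale; report whether it ran."""
    if is_stale(target, sources):
        recipe(sources, target)
        return True
    return False


def summarize(sources: Sequence[Path], target: Path) -> None:
    """Hypothetical analysis step: sum the integers in the first raw file."""
    total = sum(int(token) for token in sources[0].read_text().split())
    target.write_text(f"sum={total}\n")
```

Calling `build(Path("summary.txt"), [Path("raw.txt")], summarize)` would regenerate the summary only when `raw.txt` has changed since the last run, which is the behaviour a default Makefile target gives for free.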

ADD REPLY • link 13.4 years ago by Michael Schubert ★ 7.1k

0

something like this should be required for papers: Reproducible Research: A More General Sweave?

ADD REPLY • link updated 5.1 years ago by Ram 44k • written 13.4 years ago by Ying W ★ 4.3k

0

I don't see how making journals open access or easier for robots to read would have prevented the Duke errors. Patients received the wrong chemo because some researchers weren't required to provide reproducible code.

ADD REPLY • link 13.4 years ago by Jeremy Leipzig 22k

0

The question here is about what could be changed in the scientific publishing industry to make bioinformatics easier, not what can be done to prevent errors in scientific judgement or malpractice. Bad scientists make mistakes intentionally or unintentionally. It is not realistic to think that just providing a LaTeX file with some R code in it is going to solve this. The problem in this case lies with the researchers and reviewers of this work, not with the publishers or the publishing industry.

ADD REPLY • link 13.4 years ago by Casey Bergman 18k

Ram · Answer 2 · 2011-08-02

4

13.4 years ago

Pierre Lindenbaum 164k

I want:

1) Some semantics (RDFa?) in the abstract/paper. Not just:

"NSP3 interacts with RoXan"

but

<span property="http://purl.obolibrary.org/obo/INO_0000026">
<span about="http://www.uniprot.org/uniprot/P03536">NSP3</span>
interacts with
<span about="http://www.uniprot.org/uniprot/Q9UGR2">RoXan</span>
</span>.

2) A flag saying:

"the software/database/facts described in this paper is/are deprecated"

3) Requiring authors to contribute to Wikipedia with the content of their article (e.g. http://en.wikinews.org/wiki/RNA_journal_submits_articles_to_Wikipedia).

4) Requiring authors to post their software/tools in a public repository (GitHub, etc.).
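
To illustrate why the markup in the first item matters to a text miner, here is a minimal Python sketch that pulls the interaction out of such a snippet using only the standard-library `html.parser`. It is not a conforming RDFa processor: it reads the markup the way the example intends (outer `property` as the relation, inner `about` URIs as the two participants), and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

# The annotated sentence from the example, with balanced <span> tags.
SNIPPET = """
<span property="http://purl.obolibrary.org/obo/INO_0000026">
<span about="http://www.uniprot.org/uniprot/P03536">NSP3</span>
interacts with
<span about="http://www.uniprot.org/uniprot/Q9UGR2">RoXan</span>
</span>.
"""


class AnnotationSketchParser(HTMLParser):
    """Collects the first 'property' URI and every ('about' URI, label) pair."""

    def __init__(self):
        super().__init__()
        self.relation = None      # URI from the first 'property' attribute seen
        self.entities = []        # (URI, visible label) for each 'about' element
        self._current_about = None
        self._label_parts = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "property" in attributes and self.relation is None:
            self.relation = attributes["property"]
        if "about" in attributes:
            self._current_about = attributes["about"]
            self._label_parts = []

    def handle_data(self, data):
        # Only keep text that sits inside an element carrying 'about'.
        if self._current_about is not None:
            self._label_parts.append(data)

    def handle_endtag(self, tag):
        if self._current_about is not None:
            label = "".join(self._label_parts).strip()
            self.entities.append((self._current_about, label))
            self._current_about = None


def extract_interaction(html):
    """Return (subject URI, relation URI, object URI) for a two-entity snippet."""
    parser = AnnotationSketchParser()
    parser.feed(html)
    parser.close()
    if parser.relation is None or len(parser.entities) != 2:
        raise ValueError("expected one relation and exactly two annotated entities")
    (subject_uri, _), (object_uri, _) = parser.entities
    return (subject_uri, parser.relation, object_uri)
```

With plain text, the same fact would need named-entity recognition and relation extraction; with the annotations, `extract_interaction(SNIPPET)` is a few lines of parsing and yields database-ready identifiers.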

ADD COMMENT • link updated 5.1 years ago by Ram 44k • written 13.4 years ago by Pierre Lindenbaum 164k

1

Up vote for all but 3)

ADD REPLY • link 13.4 years ago by Jerven ▴ 660

0

@Jerven: that's interesting !:-) why not ? ( I've added a ref in my answer )

ADD REPLY • link 13.4 years ago by Pierre Lindenbaum 164k

0

For (1), I'd prefer to go the full semantic web way and add a proper qualifier as well.

ADD REPLY • link 13.4 years ago by Michael Schubert ★ 7.1k

score 3 · Answer 3 · 2011-08-02

3

13.4 years ago

Kim ▴ 100

Clear identification of the materials used or produced. This includes providing complete names or identifiers for the organism or sample, the gene, and the sequences mentioned - use available identifiers as much as possible (especially accession.version).

ADD COMMENT • link 13.4 years ago by Kim ▴ 100

0

The accession version is a big question of mine, and how updates would be handled in the browser. If we are looking at giWhatever.2 in a paper, and the link goes to giWhatever.8 later, it could affect one's conclusions about the gene in that paper. Will end users know this?

ADD REPLY • link 13.4 years ago by Mary 11k

0

Publishers need to set a standard to cite accession.VERSION (or GI), and users need to realize that the accession alone doesn't identify a specific sequence at a point in time - only the combined identifier or GI can do that.

If you query NCBI with accession.version, and it is an older record, the results page does include a message indicating that and offering the choice to see that version or see the newest record.
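
As a rough illustration of the check a paper viewer could run, here is a minimal Python sketch that splits an identifier into accession and version and compares a cited identifier against the latest one. It makes no call to NCBI; the function names are hypothetical, and the identifiers are the placeholder ones from this thread, not real records.

```python
from typing import NamedTuple, Optional


class VersionedAccession(NamedTuple):
    accession: str
    version: Optional[int]  # None when the identifier carries no ".N" suffix


def parse_accession(identifier: str) -> VersionedAccession:
    """Split 'accession.version'; a bare accession gets version None."""
    identifier = identifier.strip()
    base, dot, suffix = identifier.rpartition(".")
    if dot and base and suffix.isdigit():
        return VersionedAccession(base, int(suffix))
    return VersionedAccession(identifier, None)


def check_citation(cited: str, latest: str) -> str:
    """Classify a cited identifier as 'current', 'outdated', or 'ambiguous'.

    'latest' is assumed to come from the database and to carry a version.
    """
    cited_id = parse_accession(cited)
    latest_id = parse_accession(latest)
    if latest_id.version is None:
        raise ValueError("the latest identifier must include a version")
    if cited_id.accession != latest_id.accession:
        raise ValueError("identifiers refer to different accessions")
    if cited_id.version is None:
        # A bare accession does not pin down one sequence at a point in time.
        return "ambiguous"
    if cited_id.version < latest_id.version:
        return "outdated"
    return "current"
```

For the case raised above, `check_citation("giWhatever.2", "giWhatever.8")` reports that the paper cites an older record, which is the point where a viewer could offer the reader both versions.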

ADD REPLY • link 13.4 years ago by Kim ▴ 100

score 2 · Answer 4 · 2011-08-03

2

13.4 years ago

Alastair Kerr 5.3k

At the 25-year SwissProt conference in 2007, Amos Bairoch proposed that all papers should be sent to publishers with some mark-up language to identify genes, proteins and other data. I tried googling to see if I could find this project but had no luck. Basically it would remove the need for natural language processing on new papers and allow automatic data integration into databases such as those in IMEx, as well as better custom searching.

Not so much a killer app, but a better method of publishing science. I'd love to see this implemented in some form but I guess forming standards between publishers would be a major headache.

ADD COMMENT • link 13.4 years ago by Alastair Kerr 5.3k

0

Yeah, I've been in a variety of rooms with database folks who crave standards that would help them extract what they need to get into their databases. Since the mid-90s. Progress has been...um...slow. They have tried to work with publishers with mixed success.

ADD REPLY • link 13.4 years ago by Mary 11k

score 1 · Answer 5 · 2011-08-02

1

13.4 years ago

Kim ▴ 100

About the Elsevier genome display - it is an embedded browser provided by NCBI. See here: http://www.ncbi.nlm.nih.gov/projects/sviewer/embedded.html

ADD COMMENT • link 13.4 years ago by Kim ▴ 100

0

Yeah, but how it is implemented in the context of the papers, how it will be maintained and updated, and more--still need to be explored in situ.

ADD REPLY • link 13.4 years ago by Mary 11k