We would like to improve the browsing services that the ENA provides. Your feedback on how you use the service would be greatly appreciated.
There are two surveys (neither should take you more than 5 minutes):
- The ENA Browser in general: https://ebiux.typeform.com/to/fe319r
- The ENA Advanced Search service: https://ebiux.typeform.com/to/bLxnM7
Please let us know how you use the service and what we could do to make your tasks easier.
thanks
Laura
In the context of automating pipelines it would be an essential/useful addition, no argument there.
Having an additional option is always great, but what information can one get from ENA that can't be found using Entrez Direct (that is a real question :) )?
Sometimes having multiple options (that do more or less the same thing) confuses (new) users and becomes a matter of personal preference for educators.
I believe that ENA has a richer set of annotations than NCBI - or at least one that is better organized and more in tune with what biologists want. It is just not easy to get to them if you don't want to use a web browser.
There is a web API, but even I found it too complicated and too low level - and I am not shy about using the command line.
The ENA has programmatic query services:
http://www.ebi.ac.uk/ena/browse/programmatic-access
For searching the catalogue there are a number of REST endpoints:
http://www.ebi.ac.uk/ena/browse/search-rest
thanks
Neither of those is similar to what I have described.
FWIW, I don't consider the link below (taken from one of the resources that you mention) a useful form of programmatic access.
http://www.ebi.ac.uk/ena/data/warehouse/search?query=%22%28instrument_model%3D%22Illumina%20HiSeq%202000%22%20OR%20instrument_model%3D%22Illumina%20HiSeq%201000%22%20OR%20instrument_model%3D%22Illumina%20HiSeq%202500%22%29%20AND%20library_layout%3D%22PAIRED%22%20AND%20library_source%3D%22TRANSCRIPTOMIC%22%22&result=read_run&display=xml&download=xml
As a matter of fact, that is really what would make the ENA more useful: instead of obscure, lengthy links, a tool that builds these links for us.
That is what Entrez Direct does.
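To make the idea concrete, here is a minimal sketch in Python of what such a link builder could look like. It is not an ENA tool: the helper names (`field_equals`, `any_of`, `build_search_url`) are made up for illustration, and the base address, field names and parameters are simply copied from the example link above, so treat them as assumptions about how the service behaves.

```python
from urllib.parse import quote

# Base address copied from the example link in this thread (an assumption,
# not a documented constant).
SEARCH_URL = "http://www.ebi.ac.uk/ena/data/warehouse/search"


def field_equals(field, value):
    """Hypothetical helper: one condition of the form field="value"."""
    return f'{field}="{value}"'


def any_of(field, values):
    """Hypothetical helper: several allowed values for one field, joined by OR."""
    return "(" + " OR ".join(field_equals(field, v) for v in values) + ")"


def build_search_url(conditions, result="read_run", display="xml", download="xml"):
    """Join the conditions with AND and percent-encode everything into one URL.

    The whole query is wrapped in double quotes only because the example
    link does so; whether the service requires that is not known here.
    """
    query = '"' + " AND ".join(conditions) + '"'
    params = [
        ("query", query),
        ("result", result),
        ("display", display),
        ("download", download),
    ]
    encoded = "&".join(f"{key}={quote(value, safe='')}" for key, value in params)
    return SEARCH_URL + "?" + encoded


url = build_search_url([
    any_of("instrument_model",
           ["Illumina HiSeq 2000", "Illumina HiSeq 1000", "Illumina HiSeq 2500"]),
    field_equals("library_layout", "PAIRED"),
    field_equals("library_source", "TRANSCRIPTOMIC"),
])
print(url)
```

The readable part is the last call: three conditions that can be explained to a colleague, with the percent-encoding left to the script.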
As mentioned in the post below, yes, we have much simpler queries for programmatically fetching data and performing text searches, but yes, the advanced search is complicated. Unfortunately there is no easy way around this when you want to target search terms to specific fields within a record.
We are, however, planning to start offering downloadable scripts for performing common actions within ENA next year, so this feedback is valuable. If there are key things that you feel these scripts should do, please feel free to start posting them here and we'll see what we can do.
This is something that got clarified in my own mind while writing the Biostar Handbook, teaching courses and observing how people interact with and work with bioinformatics data.
Graphical interfaces such as the ENA's are good for exploring data but completely inappropriate for reproducing results or for communicating to a fellow scientist what has actually been done.
https://read.biostarhandbook.com/
In the whole book, which is now clocking in at more than 500 pages, there is not a single instance where we access data via a GUI - and it works like a charm. And I think that is the right way to go about it!
The main point I am trying to make is that we should use the browser to figure out what is there, but then there absolutely needs to be a way to get the same information in a simple and unambiguous way that allows users to go on their own explorations.
A lot is made of reproducible research (or the lack thereof), and one key component of that is well-defined data access. Lengthy URLs that wrap around many rows and are full of weird characters are not a good solution - they are impossible to parse visually or to explain to others.
My opinion is that having a well-documented and simple programmatic interface will be the deciding factor in choosing one resource over the other.
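As an illustration of how much such a link hides, here is a small sketch that takes the advanced-search URL quoted earlier in this thread and decodes it back into its parameters. It uses only Python's standard `urllib.parse` module; the URL is copied verbatim from the thread and nothing about the service itself is assumed.

```python
from urllib.parse import urlsplit, parse_qs

# The advanced-search link quoted earlier in this thread, copied verbatim.
url = (
    "http://www.ebi.ac.uk/ena/data/warehouse/search?query=%22%28instrument_model"
    "%3D%22Illumina%20HiSeq%202000%22%20OR%20instrument_model%3D%22Illumina%20"
    "HiSeq%201000%22%20OR%20instrument_model%3D%22Illumina%20HiSeq%202500%22%29"
    "%20AND%20library_layout%3D%22PAIRED%22%20AND%20library_source%3D%22"
    "TRANSCRIPTOMIC%22%22&result=read_run&display=xml&download=xml"
)

# Split off the part after "?" and undo the percent-encoding.
params = parse_qs(urlsplit(url).query)

# Print one parameter per line so the actual search becomes readable.
for name, values in params.items():
    print(f"{name}: {values[0]}")
```

Once decoded, the search turns out to be three conditions joined by AND plus three output settings, which is the form one would actually want to write down and share.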
We have now announced some new tools for retrieving data from ENA. These include one that allows you to download sequences more easily, as requested here. Please see our news item for more info: http://www.ebi.ac.uk/about/news/service-news/new-tools-download-data-ena