Question

How to programmatically download all of the metadata associated with a particular EBI sample?

5.8 years ago

O.rka ▴ 750

Here is an example of one of the samples I am interested in; there are hundreds more.

https://www.ebi.ac.uk/metagenomics/samples/ERS488919

I tried using Python to scrape the HTML, but the page is rendered dynamically, so I need another approach.

Is there any tool or method that takes a URL or sample identifier and outputs a table with the following metadata?

Chlorophyll sensor:0.17638 mg Chl/m3
Citation:tbd
ENA checklist:ENA TARA (ERC000030)
Environmental package:water
Event date/time end:2010-03-18T12:32
Event date/time start:2010-03-18T11:33
Event label:TARA_20100318T1133Z_039_EVENT_PUMP
Further details:tbd
Geographic location (depth):25 m
Instrument model:Illumina HiSeq 2000
Last update date:2014-05-01Z
Latitude end:18.5679 DD
Longitude end:66.4581 DD
Marine region:n/a
Nitrate sensor:-0.611888 µmol/L
Oxygen sensor:192.95875 µmol/Kg
Project name:Tara Oceans expedition (2009-2013)
Protocol label:BACT_NUC-DNA(100L)_W1.6-20
Salinity sensor:36.332317 psu
Sample collection device:PUMP (High Volume Peristaltic Pump) with ECOTriplet
Sample status:This version can be used to provide data discovery services
Sampling campaign:TARA_20100309Z
Sampling platform:SV Tara
Sampling station:TARA_039
Size fraction lower threshold:1.6
Size fraction upper threshold:20
Temperature:26.812225 °C

sequencing next-gen • 1.3k views

updated 5.8 years ago by GenoMax 152k • written 5.8 years ago by O.rka ▴ 750

score 2 · Accepted Answer · 2019-09-30

Use the EBI Metagenomics API (see the example URL below). There are multiple entry points. I tried to figure out which one generates the example you posted, but the web page must be doing some additional post-processing. In any case, the example request below shows the shape of the response.
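Whichever entry point you end up using, each sample record carries a "sample-metadata" list of key/value/unit entries (as in the response below), and that list is easy to flatten into one table row per sample. Here is a minimal Python sketch of that idea. The function names are my own, and the samples/<accession> path in fetch_sample is an assumption I have not confirmed against the API documentation. I make that assumption because the first record in the response below is ERS3721824 rather than ERS488919, which suggests the ?id= query parameter may not be filtering.

```python
import csv
import json
import sys
import urllib.request

# Base URL taken from the example request in this answer.
API_ROOT = "https://www.ebi.ac.uk/metagenomics/api/v1"


def fetch_sample(accession):
    """Fetch one sample record as a dict.

    ASSUMPTION: the API serves a single sample at samples/<accession>.
    Check the API documentation before relying on this path.
    """
    url = f"{API_ROOT}/samples/{accession}"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)["data"]
    # A list endpoint returns a list; a detail endpoint returns one object.
    return data[0] if isinstance(data, list) else data


def metadata_to_row(sample):
    """Flatten the 'sample-metadata' entries of one record into a dict."""
    row = {"sample": sample["id"]}
    for item in sample["attributes"].get("sample-metadata", []):
        value = item["value"]
        if item.get("unit"):
            # Append the unit when the API provides one.
            value = f"{value} {item['unit']}"
        row[item["key"]] = value
    return row


def write_table(rows, out=sys.stdout):
    """Write rows as tab-separated text, one column per metadata key."""
    keys = {key for row in rows for key in row} - {"sample"}
    columns = ["sample"] + sorted(keys)
    writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t", restval="")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical usage (needs network access):
# rows = [metadata_to_row(fetch_sample(acc)) for acc in ["ERS488919"]]
# write_table(rows)
```

Samples that lack a given key simply get an empty cell, so hundreds of samples with slightly different metadata still end up in one rectangular table.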

Example URL: https://www.ebi.ac.uk/metagenomics/api/v1/samples?id=ERS488919

returns the following (truncated for brevity; the full response spans 7000+ pages):

GET /metagenomics/api/v1/samples?id=ERS488919

HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/vnd.api+json
Vary: Accept

{
    "links": {
        "first": "https://www.ebi.ac.uk/metagenomics/api/v1/samples?id=ERS488919&page=1",
        "last": "https://www.ebi.ac.uk/metagenomics/api/v1/samples?id=ERS488919&page=7089",
        "next": "https://www.ebi.ac.uk/metagenomics/api/v1/samples?id=ERS488919&page=2",
        "prev": null
    },
    "data": [
        {
            "type": "samples",
            "id": "ERS3721824",
            "attributes": {
                "sample-metadata": [
                    {
                        "unit": null,
                        "key": "instrument model",
                        "value": "Illumina MiSeq"
                    },
                    {
                        "unit": null,
                        "key": "ENA checklist",
                        "value": "ERC000011"
                    },
                    {
                        "unit": null,
                        "key": "last update date",
                        "value": "2019-09-04"
                    }
                ],