How can I download data from plants.ensembl.org using wget in biomart?
5.9 years ago
utsafar ▴ 80

Using commands like this:

wget -O result.txt 'http://www.ensembl.org/biomart/martservice?query=<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6"><Dataset name="hsapiens_gene_ensembl" interface="default"><Filter name="ensembl_gene_id" value="ENSG00000139618"/><Attribute name="ensembl_gene_id"/><Attribute name="ensembl_transcript_id"/><Attribute name="hgnc_symbol"/><Attribute name="uniprotswissprot"/></Dataset></Query>'

I can easily download data from ensembl.org, but when I use the same command for plants.ensembl.org I am just redirected to http://plants.ensembl.org/index.html.

How can I solve my problem?

ensembl wget biomart • 3.2k views

Mike and Astrid, thanks to your help I found my mistakes.

This was my incorrect command:

wget -O result.txt 'http://www.plants.ensembl.org/biomart/martview/martservice?query=<Query virtualSchemaName="plants_mart" formatter="CSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6"><Dataset name="atauschii_eg_gene" interface="default"><Attribute name="ensembl_peptide_id"/></Dataset></Query>'

First mistake: I forgot to remove 'www.' from the URL.

Second mistake: '/martview' must be removed from the path.

So this is an example of a correct command to get Aegilops tauschii gene IDs from plants.ensembl.org:

wget -O result.txt 'http://plants.ensembl.org/biomart/martservice?query=<Query virtualSchemaName="plants_mart" formatter="CSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6"><Dataset name="atauschii_eg_gene" interface="default"><Attribute name="ensembl_gene_id"/></Dataset></Query>'
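
Since both mistakes were in hand-typed parts of the URL, it may help to assemble the URL from its parts. This is only a sketch: `biomart_url` is a made-up helper name, not part of BioMart or wget, and it hard-codes the same formatter and query options as the command above.

```shell
# Hypothetical helper: assemble a BioMart martservice URL from its parts.
# Usage: biomart_url HOST SCHEMA DATASET ATTRIBUTE...
biomart_url() {
    host=$1; schema=$2; dataset=$3
    shift 3
    # Build one <Attribute/> element per remaining argument.
    attrs=""
    for a in "$@"; do
        attrs="${attrs}<Attribute name=\"$a\"/>"
    done
    printf 'http://%s/biomart/martservice?query=' "$host"
    printf '<Query virtualSchemaName="%s" formatter="CSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6">' "$schema"
    printf '<Dataset name="%s" interface="default">%s</Dataset></Query>\n' "$dataset" "$attrs"
}

# Print the URL for the Aegilops tauschii example; nothing is downloaded here.
biomart_url plants.ensembl.org plants_mart atauschii_eg_gene ensembl_gene_id
```

The printed URL can then be passed to wget in double quotes, e.g. `wget -O result.txt "$(biomart_url plants.ensembl.org plants_mart atauschii_eg_gene ensembl_gene_id)"`.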
5.9 years ago
Mike Smith ★ 2.1k

This is a pretty unusual way to query Ensembl. Typically people use the Perl API or the biomaRt R package, which should make constructing queries a lot easier, but if this suits your workflow then great.

I don't know how this is different from what you're trying, but this works for me:

wget -O result.txt 'http://plants.ensembl.org/biomart/martservice?query=<Query virtualSchemaName="plants_mart" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6">     
<Dataset name="athaliana_eg_gene" interface="default">
    <Filter name="ensembl_gene_id" value="AT1G01010"/>
    <Attribute name="ensembl_gene_id"/>
    <Attribute name="ensembl_transcript_id"/>
    <Attribute name="external_gene_name"/>
</Dataset>
</Query>'

You can paste the URL into a browser to check it's working. Here are the contents of the output file:

% cat result.txt 
AT1G01010   AT1G01010.1 NAC001
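
The columns come back in the same order as the Attribute elements in the query, so with the TSV formatter the file is easy to post-process. A small sketch, assuming the three-column layout above; `gene_names` is just an illustrative function name:

```shell
# Hypothetical helper: keep only the gene ID (column 1) and gene name (column 3)
# from TSV input whose columns are gene ID, transcript ID, gene name.
gene_names() {
    awk -F '\t' 'BEGIN { OFS = "\t" } { print $1, $3 }'
}

# Demonstrate on the single line shown above instead of reading result.txt.
printf 'AT1G01010\tAT1G01010.1\tNAC001\n' | gene_names
```

On a real download the same function would be used as `gene_names < result.txt`.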
5.9 years ago

You could also look into the Ensembl Genomes REST API as an alternative: http://rest.ensemblgenomes.org

E.g. the lookup endpoint (documented at http://rest.ensemblgenomes.org/documentation/info/lookup): rest.ensemblgenomes.org/lookup/id/AT1G01010?content-type=application/json;expand=1

The endpoint documentation includes wget examples.
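
As a rough sketch of how the lookup URL above could be used from the shell (the function names `rest_lookup_url` and `rest_lookup` are made up for this example, and the wget call is untested here, so check the endpoint documentation for the exact form it recommends):

```shell
# Hypothetical helper: build the lookup URL for one stable ID,
# using the same query string as the example URL above.
rest_lookup_url() {
    printf 'http://rest.ensemblgenomes.org/lookup/id/%s?content-type=application/json;expand=1\n' "$1"
}

# Hypothetical helper: fetch the JSON record for one ID and write it to stdout.
# Defined only; it is not called here, so running this block makes no request.
rest_lookup() {
    wget -q -O - "$(rest_lookup_url "$1")"
}

# Print the URL that would be requested for the Arabidopsis example gene.
rest_lookup_url AT1G01010
```

Calling `rest_lookup AT1G01010 > AT1G01010.json` would then save the record to a file.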
