As of summer 2022, there is a Python package for querying UniProt's new REST API: Unipressed, by Michael Milton (multimeric).
Announcement:
Unipressed GitHub repo.
Unipressed documentation.
Demonstration Code Using Unipressed (consistent with examples in earlier posts):
from unipressed import UniprotkbClient

for record in UniprotkbClient.search(
    query={
        "or_": [
            {"ec": "3.1.3.9"},
            {"ec": "2.7.1.2"},
        ],
        "and_": [
            {"organism_id": "9606"},
        ],
    },
    # fields=["length", "gene_names"]
).each_record():
    # display() is available in Jupyter/IPython; use print(record) elsewhere
    display(record)
The Unipressed documentation, currently under 'Advantages', says it supports the formats json, tsv, list, and xml.
Here is an example choosing the tsv format:
from unipressed import UniprotkbClient

for record in UniprotkbClient.search(
    query={
        "or_": [
            {"ec": "3.1.3.9"},
            {"ec": "2.7.1.2"},
        ],
        "and_": [
            {"organism_id": "9606"},
        ],
    },
    format="tsv",
    fields=["accession", "gene_names", "length"],
).each_record():
    display(record)
That results in:
{'Entry': 'Q9NQR9', 'Gene Names': 'G6PC2 IGRP', 'Length': '355'}
{'Entry': 'P35575', 'Gene Names': 'G6PC1 G6PC G6PT', 'Length': '357'}
{'Entry': 'Q9BUM1', 'Gene Names': 'G6PC3 UGRP', 'Length': '346'}
{'Entry': 'P35575-2', 'Gene Names': 'G6PC1 G6PC G6PT', 'Length': '176'}
{'Entry': 'Q9NQR9-2', 'Gene Names': 'G6PC2 IGRP', 'Length': '102'}
{'Entry': 'Q9NQR9-3', 'Gene Names': 'G6PC2 IGRP', 'Length': '154'}
{'Entry': 'A0A024R1U9', 'Gene Names': 'G6PC hCG_16953', 'Length': '359'}
(I went with a very simple form of output there to show human-readable results. To actually save the data as TSV-formatted text, you can adapt the approach used at the end of Michael Milton's (multimeric) reply to this post below, as I do with the above example code here.)
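For readers who just want a file on disk, here is a minimal sketch of one way to write such records out as TSV. It is not the approach from Michael Milton's reply (which is not reproduced here); it is a stand-in using only the standard library `csv` module, and the helper name `records_to_tsv` is hypothetical. It assumes each record is a flat dict like the ones printed above, and it uses two sample records copied from that output so it runs without network access.

```python
import csv
import io


def records_to_tsv(records, out):
    """Write an iterable of flat dict records to `out` as tab-separated text.

    Hypothetical helper, not part of Unipressed. Returns the number of rows written.
    """
    records = list(records)
    if not records:
        return 0
    # Use the keys of the first record as the header row.
    writer = csv.DictWriter(
        out,
        fieldnames=list(records[0].keys()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return len(records)


# Two records copied from the output shown above, used as offline sample data.
sample = [
    {"Entry": "Q9NQR9", "Gene Names": "G6PC2 IGRP", "Length": "355"},
    {"Entry": "P35575", "Gene Names": "G6PC1 G6PC G6PT", "Length": "357"},
]

buffer = io.StringIO()
count = records_to_tsv(sample, buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
```

To write a real file instead of an in-memory buffer, open one with `open("results.tsv", "w", newline="")` and pass it as `out`, feeding in the records from the `format="tsv"` search shown above.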
The query above returns seven hits, as opposed to the four shown in the direct results at the site in the August 31, 2022 post by @roder.thomas. This is because the query results include isoforms alongside the primary accessions of the hits. So in addition to the four shown in the August 31, 2022 post by @roder.thomas:
Q9NQR9
P35575
Q9BUM1
A0A024R1U9
You also see listed:
P35575-2
Q9NQR9-2
Q9NQR9-3
Those isoforms are listed under the section 'Sequence & Isoforms' in the entry pages accessible from the screen in the August 31, 2022 post by @roder.thomas.
You can filter out those isoforms to get the 4 seen in the direct access by removing any record with a dash in its accession, like so:
from unipressed import UniprotkbClient

collected = []
for record in UniprotkbClient.search(
    query={
        "or_": [
            {"ec": "3.1.3.9"},
            {"ec": "2.7.1.2"},
        ],
        "and_": [
            {"organism_id": "9606"},
        ],
    },
    fields=["length", "gene_names"],
).each_record():
    collected.append(record)
collected = [x for x in collected if "-" not in x["primaryAccession"]]
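The same dash test can also be used to see which isoforms belong to which canonical entry. The following is a small offline sketch under the assumption that an isoform accession is always the parent accession plus a `-N` suffix, as in the seven hits listed above. The function name `split_isoforms` is hypothetical and the accession list is copied from that output.

```python
from collections import defaultdict


def split_isoforms(accessions):
    """Separate canonical accessions from isoform accessions.

    Hypothetical helper. Returns (canonical_list, {parent_accession: [isoforms]}).
    Assumes isoforms are written as '<parent>-<number>'.
    """
    canonical = []
    isoforms = defaultdict(list)
    for acc in accessions:
        if "-" in acc:
            # Everything before the first dash is the parent accession.
            isoforms[acc.split("-", 1)[0]].append(acc)
        else:
            canonical.append(acc)
    return canonical, dict(isoforms)


# The seven accessions from the TSV output above, in the order returned.
hits = [
    "Q9NQR9", "P35575", "Q9BUM1",
    "P35575-2", "Q9NQR9-2", "Q9NQR9-3",
    "A0A024R1U9",
]

canonical, isoforms = split_isoforms(hits)
print(canonical)
print(isoforms)
```

Note that with the default JSON records the accession is under `"primaryAccession"`, whereas the TSV records above expose it under `"Entry"`, so the list passed in needs to be built from whichever key matches the format requested.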
XML Format Example:
The original post asked in particular about downloading the results in XML format, and Unipressed has that built in already. Here, some of the data stored in each XML record object is accessed and printed to show something human-readable:
from unipressed import UniprotkbClient

for record in UniprotkbClient.search(
    query={
        "or_": [
            {"ec": "3.1.3.9"},
            {"ec": "2.7.1.2"},
        ],
        "and_": [
            {"organism_id": "9606"},
        ],
    },
    format="xml",
).each_record():
    # Show the XML object as a string by uncommenting the next two lines & deleting everything after those lines
    #from xml.etree import ElementTree # from https://stackoverflow.com/a/48671499/8508004
    #print(ElementTree.tostring(record, encoding='unicode'))
    # Below based on [Processing XML in Python — ElementTree:A Beginner’s Guide](https://towardsdatascience.com/processing-xml-in-python-elementtree-c8992941efd2)
    # slice `[28:]` added to remove `{http://uniprot.org/uniprot}` from the front of tags
    #[print(elem.tag[28:]) for elem in record.iter()]
    #[print(child.tag, child.attrib) for child in record]
    [print(elem.tag[28:], elem.attrib, elem.text) for elem in record.iter('{http://uniprot.org/uniprot}fullName')]
    [print(elem.tag[28:], elem.attrib, elem.text) for elem in record.iter('{http://uniprot.org/uniprot}ecNumber')]
    [print(elem.tag[28:], elem.attrib) for elem in record.iter('{http://uniprot.org/uniprot}proteinExistence')]
    print("*"*60)
Results in:
fullName {} Glucose-6-phosphatase 2
fullName {} Islet-specific glucose-6-phosphatase catalytic subunit-related protein
ecNumber {} 3.1.3.9
proteinExistence {'type': 'evidence at protein level'}
************************************************************
fullName {'evidence': '36'} Glucose-6-phosphatase catalytic subunit 1
fullName {} Glucose-6-phosphatase
fullName {} Glucose-6-phosphatase alpha
ecNumber {'evidence': '9 12 16 25'} 3.1.3.9
proteinExistence {'type': 'evidence at protein level'}
************************************************************
fullName {} Glucose-6-phosphatase 3
fullName {} Glucose-6-phosphatase beta
fullName {} Ubiquitous glucose-6-phosphatase catalytic subunit-related protein
ecNumber {} 3.1.3.9
proteinExistence {'type': 'evidence at protein level'}
************************************************************
fullName {'evidence': '5'} Isoform 2 of Glucose-6-phosphatase catalytic subunit 1
fullName {} Glucose-6-phosphatase
fullName {} Glucose-6-phosphatase alpha
ecNumber {'evidence': '1 2 3 4'} 3.1.3.9
proteinExistence {'type': 'evidence at protein level'}
************************************************************
fullName {} Isoform 2 of Glucose-6-phosphatase 2
fullName {} Islet-specific glucose-6-phosphatase catalytic subunit-related protein
ecNumber {} 3.1.3.9
proteinExistence {'type': 'evidence at protein level'}
************************************************************
fullName {} Isoform 3 of Glucose-6-phosphatase 2
fullName {} Islet-specific glucose-6-phosphatase catalytic subunit-related protein
ecNumber {} 3.1.3.9
proteinExistence {'type': 'evidence at protein level'}
************************************************************
fullName {'evidence': '4'} Glucose-6-phosphatase
ecNumber {'evidence': '4'} 3.1.3.9
proteinExistence {'type': 'inferred from homology'}
************************************************************
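The `[28:]` slice in the code above works because `{http://uniprot.org/uniprot}` is exactly 28 characters long, but it is tied to that one namespace. Below is a sketch of a slightly more general way to handle the namespace with the standard library `ElementTree`: a prefix map passed to `findall`/`find`, plus a small helper that strips any `{...}` prefix. The XML fragment is hand-written and heavily simplified for illustration (it is not a real UniProt response), and the names `NS` and `local_name` are hypothetical.

```python
from xml.etree import ElementTree

# Prefix map so paths can be written as 'up:fullName' instead of
# '{http://uniprot.org/uniprot}fullName'. The prefix 'up' is arbitrary.
NS = {"up": "http://uniprot.org/uniprot"}


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


# Hand-written, simplified fragment (an assumption, for offline illustration).
sample_xml = """
<entry xmlns="http://uniprot.org/uniprot">
  <protein>
    <recommendedName>
      <fullName>Glucose-6-phosphatase 2</fullName>
      <ecNumber>3.1.3.9</ecNumber>
    </recommendedName>
  </protein>
  <proteinExistence type="evidence at protein level"/>
</entry>
""".strip()

record = ElementTree.fromstring(sample_xml)

# './/' searches all descendants; the namespace map resolves the 'up:' prefix.
names = [elem.text for elem in record.findall(".//up:fullName", NS)]
ec_numbers = [elem.text for elem in record.findall(".//up:ecNumber", NS)]
existence = record.find("up:proteinExistence", NS).attrib["type"]
tags = [local_name(elem.tag) for elem in record.iter()]

print(names, ec_numbers, existence)
print(tags)
```

The same `findall` calls should work on the `record` objects yielded in the loop above, since the commented-out `ElementTree.tostring(record, ...)` line there already treats them as ElementTree elements.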
The UniProt IDmapping doesn't actually support EC numbers. For performance reasons, databases where the mapping relationship to UniProtKB identifiers is one-to-many, e.g. GO, InterPro or PubMed, are not supported. There is a note about this in the help page http://www.uniprot.org/help/uploadlists.
You can however build RESTful queries of the form
http://www.uniprot.org/uniprot/?query=(ec%3A+3.1.3.9+or+ec%3A2.7.1.2)+organism%3A9606&format=xml
You could also use the tab-delimited format:
http://www.uniprot.org/uniprot/?query=(ec%3A+3.1.3.9+or+ec%3A2.7.1.2)+organism%3A9606&format=tab&columns=id,entry_name,protein_names,genes,comment(KINETICS)
This solution no longer works as noted by @roder.thomas.
Elisabeth Gasteiger - Is there an update that can be posted instead? Otherwise this answer should be moved to a comment for historical reference.
Note: A new answer has been added, so this originally accepted answer has been moved to a comment for reference. It is no longer valid.