This is a follow-up to this question, where I learned how to retrieve certain information from UniProt. I would now like to extend the SPARQL query shown below so that it also retrieves the associated Rhea reactions for each protein, ideally:
- the reaction ID
- its substrate and product IDs (e.g. as ChEBI identifiers), along with their stoichiometric coefficients and charges
Example 19 at Rhea's SPARQL endpoint suggests that this connection is possible, but I do not know how to link the two databases. One way might be to go via the EC numbers that are retrieved; I posted a follow-up question here.
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT
  ?protein
  ?taxon
  ?name
  ?kos
  ?ecs
WHERE
{
  ?protein a up:Protein .
  ?protein up:reviewed true .  # have to be reviewed
  ?taxon a up:Taxon .
  ?taxon up:scientificName ?name .
  VALUES ?taxonlist { taxon:3702 taxon:562 }
  {
    # proteins from any taxon below one of the listed taxa
    ?taxon rdfs:subClassOf ?taxonlist .
    ?protein up:organism ?taxon .
  } UNION {
    # proteins annotated directly with one of the listed taxa;
    # bind ?taxon here so it is not joined against every taxon
    ?protein up:organism ?taxonlist .
    BIND(?taxonlist AS ?taxon)
  }
  {
    ?protein up:existence up:Evidence_at_Protein_Level_Existence .
  } UNION {
    ?protein up:existence up:Evidence_at_Transcript_Level_Existence .
  }
  {
    # all EC numbers of a protein, including those of its components and domains
    SELECT ?protein (GROUP_CONCAT(?ec; SEPARATOR=", ") AS ?ecs)
    WHERE {
      ?protein up:enzyme|((up:component|up:domain)/up:enzyme) ?ec
    } GROUP BY ?protein
  }
  OPTIONAL {
    # KEGG Orthology cross-references, if any
    SELECT ?protein (GROUP_CONCAT(?ko; SEPARATOR=", ") AS ?kos)
    WHERE {
      ?protein rdfs:seeAlso ?ko .
      ?ko up:database <http://purl.uniprot.org/database/KO>
    } GROUP BY ?protein
  }
}
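To make the EC route mentioned above more concrete, here is a rough, untested sketch of what I imagine a federated query could look like. It takes the EC numbers found on the UniProt side and looks them up in Rhea through a `SERVICE` clause. The Rhea endpoint URL and all `rh:` predicates (`rh:ec`, `rh:side`, `rh:contains`, `rh:coefficient`, `rh:compound`, `rh:chebi`, `rh:charge`) are my assumptions about Rhea's RDF model and would need to be checked against its documentation. The restriction to `taxon:3702` and the `LIMIT` are only there to keep the sketch small.

```sparql
PREFIX up:    <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rh:    <http://rdf.rhea-db.org/>
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?protein ?ec ?reaction ?side ?chebi ?coefficient ?charge
WHERE
{
  # UniProt side: reviewed proteins and their EC numbers
  # (same property path as in the query above)
  ?protein a up:Protein ;
           up:reviewed true ;
           up:organism taxon:3702 ;
           up:enzyme|((up:component|up:domain)/up:enzyme) ?ec .

  # Rhea side (assumed endpoint URL and predicates)
  SERVICE <https://sparql.rhea-db.org/sparql> {
    ?reaction rdfs:subClassOf rh:Reaction ;
              rh:ec ?ec ;
              rh:side ?side .
    # each participant hangs off a sub-property of rh:contains,
    # which is assumed to carry the stoichiometric coefficient
    ?side ?contains ?participant .
    ?contains rdfs:subPropertyOf rh:contains ;
              rh:coefficient ?coefficient .
    ?participant rh:compound ?compound .
    ?compound rh:chebi ?chebi .
    OPTIONAL { ?compound rh:charge ?charge }
  }
}
LIMIT 100
```

This assumes that Rhea and UniProt use the same IRIs for EC numbers, so that `?ec` can serve as the join variable. Even if that holds, one EC number may map to several reactions, so going via EC numbers might return more reactions per protein than a direct protein-to-reaction link would.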
Tagging: Elisabeth Gasteiger