what is the uniprot RDF entity for "uniprot accession"?
4.2 years ago
mk ▴ 300

Every protein in UniProt has a unique accession number. For example, "O15169" is the accession for human Axin 1.

Other RDF stores that refer to UniProt proteins use this accession (e.g. the Pathway Commons reference).

This document describes the RDF schema for UniProt.

Where is the UniProt accession in this RDF schema?

uniprot rdf • 1.2k views
4.2 years ago
me ▴ 760

In the UniProt RDF model, the accession appears only inside the entry IRI, which has the form http://purl.uniprot.org/uniprot/${ACCESSION}.

To go from an accession string in Pathway Commons to an IRI, one uses a SPARQL snippet like:

VALUES ?acc { "P05067" }
BIND(IRI(CONCAT("http://purl.uniprot.org/uniprot/", ?acc)) AS ?entry)
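
If the accessions are handled in a client script before the query is sent, the same mapping can be done outside SPARQL. The following is only a minimal sketch of that idea: the function names (`accession_to_iri`, `iri_to_accession`, `values_clause`) are hypothetical helpers made up for illustration, not part of any UniProt library, and the only thing taken from the answer is the IRI form http://purl.uniprot.org/uniprot/${ACCESSION}.

```python
# Base of UniProt entry IRIs, as given in the answer. Note the trailing slash.
UNIPROT_BASE = "http://purl.uniprot.org/uniprot/"


def accession_to_iri(accession: str) -> str:
    """Turn an accession string such as 'P05067' into a UniProt entry IRI."""
    return UNIPROT_BASE + accession.strip()


def iri_to_accession(iri: str) -> str:
    """Recover the accession from an entry IRI (the reverse direction)."""
    if not iri.startswith(UNIPROT_BASE):
        raise ValueError(f"not a UniProt entry IRI: {iri}")
    return iri[len(UNIPROT_BASE):]


def values_clause(accessions) -> str:
    """Build a SPARQL VALUES clause that binds ?entry directly to entry IRIs.

    Hypothetical convenience helper: it does the string-to-IRI step on the
    client side instead of with BIND(IRI(CONCAT(...))) inside the query.
    """
    iris = " ".join(f"<{accession_to_iri(a)}>" for a in accessions)
    return f"VALUES ?entry {{ {iris} }}"


if __name__ == "__main__":
    print(accession_to_iri("P05067"))
    print(iri_to_accession("http://purl.uniprot.org/uniprot/O15169"))
    print(values_clause(["P05067", "O15169"]))
```

Binding `?entry` to IRIs directly should be equivalent to the BIND approach above; which one is more convenient depends on where the accession strings come from.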

There are two reasons that we don't have the primary accession as a string in our RDF or SPARQL endpoint.

  1. Avoiding false joins. A string that is a UniProt accession might also be used elsewhere to identify something completely different. Without the IRI part, joining on the bare string can produce false matches and wrong results.
  2. Adding a string for each identifier would add hundreds of millions of extra triples and strings to the database, which would negatively impact performance and storage.

thanks for the thorough answer
