Mapping exon position on UniProt entries
8.0 years ago

Hello there,

My current project requires exon start/end positions on UniProt entries. My UniProt entries are mapped to Ensembl transcript IDs. Is there a straightforward way to obtain the exon start/end positions mapped onto a UniProt entry (if necessary, given the Ensembl transcript ID) via the REST service?

Thanks

Ensembl exon UniProt mapping Ensembl REST • 2.2k views
8.0 years ago
me ▴ 760

I don't know whether this can be done via the Ensembl REST service, but it can be done by combining the UniProt SPARQL endpoint with the Ensembl SPARQL endpoint. If you already have the Ensembl transcript IDs, you can query the Ensembl endpoint directly, without involving the UniProt one, by using only the patterns inside the SERVICE clause.

PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX faldo: <http://biohackathon.org/resource/faldo#>
PREFIX core: <http://purl.uniprot.org/core/>
PREFIX uniprotkb: <http://purl.uniprot.org/uniprot/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX ensemblprotein: <http://rdf.ebi.ac.uk/resource/ensembl.protein/>
PREFIX ensemblterms: <http://rdf.ebi.ac.uk/terms/ensembl/>
PREFIX sio: <http://semanticscience.org/resource/>

SELECT ?protein ?transcript ?exon ?order ?length {
  BIND(uniprotkb:P05067 AS ?protein)
  ?protein rdfs:seeAlso ?transcript .
  ?transcript core:database <http://purl.uniprot.org/database/Ensembl> .
  SERVICE <http://www.ebi.ac.uk/rdf/services/ensembl/sparql/> {
    ?transcript obo:SO_translates_to ?peptide .
    ?peptide a ensemblterms:protein .
    ?transcript obo:SO_has_part ?exon ;
                sio:SIO_000974 ?orderedPart .
    ?orderedPart sio:SIO_000628 ?exon .
    ?exon faldo:location ?location .
    ?location faldo:begin ?bf . ?bf faldo:position ?begin .
    ?location faldo:end ?ef . ?ef faldo:position ?end .
    ?orderedPart sio:SIO_000300 ?order .
  }
  BIND(ABS(?end - ?begin) AS ?length)
}

This gets the exons in Ensembl for the UniProt entry P05067. It first gets the corresponding transcripts from UniProt; then, for each transcript, it goes to the Ensembl endpoint. There we double-check that the transcript translates to a peptide. The transcript has parts (a Sequence Ontology relation) and ordered parts (SIO terms); each ordered part is linked to an exon and carries its order. Then we get the location of the exon on the chromosome and ask for its begin and end positions. Finally, we calculate the length as the difference between those two positions and make it a positive value using the ABS function.
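If you already have the Ensembl transcript IDs, the patterns inside the SERVICE clause can be sent to the Ensembl SPARQL endpoint on their own. The following is a minimal sketch of that variant, not a tested query. It assumes that transcript IRIs follow the same pattern as the ensemblprotein prefix above (the ensembltranscript prefix is that assumption), and ENST00000000000 is a placeholder, not a real ID. It also returns ?begin and ?end, which the query above computes but does not select.

```sparql
PREFIX faldo: <http://biohackathon.org/resource/faldo#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX sio: <http://semanticscience.org/resource/>
# Assumption: transcript IRIs mirror the ensembl.protein IRI pattern.
PREFIX ensembltranscript: <http://rdf.ebi.ac.uk/resource/ensembl.transcript/>

SELECT ?transcript ?exon ?order ?begin ?end ?length {
  # Placeholder ID: list your own Ensembl transcript IDs here.
  VALUES ?transcript { ensembltranscript:ENST00000000000 }
  # Exons of the transcript and their rank within it.
  ?transcript obo:SO_has_part ?exon ;
              sio:SIO_000974 ?orderedPart .
  ?orderedPart sio:SIO_000628 ?exon ;
               sio:SIO_000300 ?order .
  # Chromosomal begin and end of each exon.
  ?exon faldo:location ?location .
  ?location faldo:begin ?bf . ?bf faldo:position ?begin .
  ?location faldo:end ?ef . ?ef faldo:position ?end .
  BIND(ABS(?end - ?begin) AS ?length)
}
ORDER BY ?transcript ?order
```

Run it against the endpoint named in the SERVICE clause. Note that, like the federated query, this gives exon coordinates on the chromosome, not positions on the protein sequence.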

8.0 years ago
kidehen ▴ 10

Jerven,

Here's a link to a live SPARQL query results page, using the UniProt SPARQL Query Service, for your answer above. The net effect is that readers are a mouse click away from a page demonstrating the result of your federated SPARQL query :)
