Reactome Clarification
10.8 years ago
sam.neaves ▴ 20

Hello,

I am trying to understand how the BioPAX Level 3 format relates to the Reactome diagrams. (I do have the manual: http://www.biopax.org/release/biopax-level3-documentation.pdf ) I am starting by manually going through the relations of a small pathway, but I want to write code that does this automatically once I have my head wrapped around it. I am looking at the simple pathway 1472, "Regulation of Apoptosis", and its simple subpathway 1473, "Regulation of activated PAK-2p34 by proteasome mediated degradation".

This subpathway has two reactions, BiochemicalReaction6535 and BiochemicalReaction6536 and two pathway steps, 8092 and 8091.

I thought I understood that reactions can be controlled and that this is indicated by the Control class. So we can look up that, for example, Catalysis1 controls BiochemicalReaction2. However, if I query for the catalysis that controls BiochemicalReaction6536, it returns no results, even though the diagram clearly shows the reaction being catalyzed by the "26S proteasome".

If I look up pathway step 8092, it has one stepProcess for BiochemicalReaction6536 and one for Catalysis102. If I then look up which reactions are controlled by Catalysis102, it returns BiochemicalReaction271, with Complex343 (26S proteasome) as its controller. Is BiochemicalReaction271 somehow related to BiochemicalReaction6536? If so, how is this indicated? Why does it not just say that BiochemicalReaction6536 is controlled by Catalysis102?

If I want to computationally extract the network, do I say: there is a catalysis link between a node and a reaction IF the reaction is controlled by that catalysis OR IF they both appear in the same pathway step?

Or is this incorrect or not sufficient? I.e., are there even more complications? ;)
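To make the proposed rule concrete, here is a minimal Prolog sketch of it. It assumes the triples are available as plain rdf/3 facts and uses shortened, hypothetical atom names (catalysis102, stepProcess, and so on) in place of the full URIs; the predicate name catalysis_link/3 is also just an illustration.

```prolog
% Toy triples mirroring the IDs in the question, with shortened
% hypothetical names instead of full URIs.
rdf(catalysis102, type, catalysis).
rdf(catalysis102, controller, complex343).
rdf(catalysis102, controlled, biochemicalReaction271).
rdf(pathwayStep8092, stepProcess, biochemicalReaction6536).
rdf(pathwayStep8092, stepProcess, catalysis102).

% Case 1: the Catalysis names the reaction in its `controlled` property.
catalysis_link(Controller, Reaction, direct) :-
    rdf(Catalysis, type, catalysis),
    rdf(Catalysis, controlled, Reaction),
    rdf(Catalysis, controller, Controller).
% Case 2: the Catalysis and the reaction are listed in the same PathwayStep.
catalysis_link(Controller, Reaction, same_step) :-
    rdf(Step, stepProcess, Catalysis),
    rdf(Catalysis, type, catalysis),
    rdf(Catalysis, controller, Controller),
    rdf(Step, stepProcess, Reaction),
    \+ rdf(Reaction, type, catalysis).   % skip other Catalysis entries in the step
```

With these facts, `?- catalysis_link(C, biochemicalReaction6536, How).` gives `C = complex343, How = same_step`. The third argument records which case produced the link, so the two kinds of evidence can be told apart later.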

Would I then do a similar thing for TemplateReactionRegulation and Modulation links?

Thanks in advance for any help.

10.8 years ago
B. Arman Aksoy ★ 1.2k

Hi Sam,

Can you provide a link to the BioPAX file that you are working with? I understand that this is a follow-up question to your previous question about Reactome; but I was not able to find the IDs you mentioned in the BioPAX 3 export from Reactome -- full RDF IDs will really help following your argument.

But I will try to answer your question as much as I can understand:

  1. You are right about catalysis -> reaction links. They are directional, and the regulation information is stored on the Catalysis/Control side. So if you are given only a reaction, the object itself does not contain information about its regulation; you need to run a second query to extract all Catalysis objects whose "controlled" property is that reaction.
  2. The computational extraction of a BioPAX element, e.g. a Pathway, is what we call a "fetch" operation, meaning that you want all the data necessary to define that element from a model. If you are not working with native BioPAX-compatible libraries, then this is somewhat harder to accomplish manually because, as you mentioned, there are many complications.
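The "second query" in point 1 could look roughly like the following Prolog sketch. It assumes plain rdf/3 facts with shortened, hypothetical atom names; modulation7 and smallMolecule9 are invented purely for illustration. Because a plain triple lookup does no subclass reasoning, the Control subclasses defined in BioPAX Level 3 are listed explicitly.

```prolog
% Toy triples with shortened hypothetical names in place of full URIs.
rdf(catalysis1, type, catalysis).
rdf(catalysis1, controller, protein5).
rdf(catalysis1, controlled, biochemicalReaction2).
rdf(modulation7, type, modulation).             % hypothetical instance
rdf(modulation7, controller, smallMolecule9).   % hypothetical instance
rdf(modulation7, controlled, catalysis1).

% Control and its BioPAX Level 3 subclasses.
control_class(control).
control_class(catalysis).
control_class(modulation).
control_class(templateReactionRegulation).

% regulator_of(+Process, -Control, -Class, -Controller):
% start from the regulated process and look backwards for its controls.
regulator_of(Process, Control, Class, Controller) :-
    rdf(Control, controlled, Process),
    rdf(Control, type, Class),
    control_class(Class),
    rdf(Control, controller, Controller).
```

For example, `?- regulator_of(biochemicalReaction2, Ctl, Class, By).` yields `Ctl = catalysis1, Class = catalysis, By = protein5`, and the same predicate finds the Modulation acting on catalysis1.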

I suggest you have a look at Paxtools and the Paxtools User Guide, which provide lots of information about basic BioPAX file handling. This also includes the Fetcher, Completer and Traverse methods, all of which are, IMHO, of interest to you in terms of what you want to accomplish.

We have also implemented most of these operations as part of Pathway Commons where you can get a self-consistent BioPAX file for any RDF ID you provide. For example:

If you can provide more details about which tool you are using for queries, which file you are working with, and the full RDF IDs of the objects you refer to, I will be happy to provide more information.

Hope this helps,

10.8 years ago
rodche ▴ 60

Not a complete answer, just a brief comment.

That was a bug in the Reactome BioPAX export script (there should be more Catalysis instances; they are fixing it now, and the next release, v48, won't have the issue). Keep in mind that most of the URIs (RDF IDs) in the Reactome BioPAX (both the batch download and the online, per-pathway-diagram export) are basically regenerated every time, so it's hard to talk about a model if we do not have exactly the same file in hand.

All the best, Igor.


thanks for the details, Igor :)

10.8 years ago
sam.neaves ▴ 20

Thanks for your help in this. It is appreciated.

To give you more detail. I am using the 'Homo Sapiens.owl' file available from: http://www.reactome.org/download/current/biopax.zip .

I am using SWI-Prolog's RDF parser ( http://www.swi-prolog.org/pldoc/package/rdf2pl.html ) to bring the file into a Prolog knowledge base.

So, as an example of what I am trying to understand: I am looking at the pathway "Regulation of activated PAK-2p34 by proteasome mediated degradation", rdf:ID="Pathway1473". Diagram: http://www.reactome.org/PathwayBrowser/#DIAGRAM=169911&ID=211733&PATH=109581

By inspecting the diagram, it can be seen that there is a reaction "Regulation of activated PAK-2p34 by proteasome mediated degradation" which is catalyzed by the "26S proteasome".

The XML snippet would be:

<bp:Pathway rdf:ID="Pathway1473">
  <bp:pathwayComponent rdf:resource="#BiochemicalReaction6535"/>
  <bp:pathwayComponent rdf:resource="#BiochemicalReaction6536"/>
  <bp:pathwayOrder rdf:resource="#PathwayStep8092"/>
  <bp:pathwayOrder rdf:resource="#PathwayStep8091"/>
  <bp:organism rdf:resource="#BioSource2"/>
  <bp:displayName rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Regulation of activated PAK-2p34 by proteasome mediated degradation</bp:displayName>
  <bp:xref rdf:resource="#UnificationXref80321"/>
  <bp:xref rdf:resource="#UnificationXref80322"/>
  <bp:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Stimulation of cell death by PAK-2 requires the generation and stabilization of the caspase-activated form, PAK-2p34 (Walter et al., 1998;Jakobi et al., 2003). Levels of proteolytically activated PAK-2p34 protein are controlled by ubiquitin-mediated proteolysis. PAK-2p34 but not full-length PAK-2 is degraded by the 26 S proteasome (Jakobi et al., 2003). It is not known whether ubiquitination and degradation of PAK-2p34 occurs in the cytoplasm or in the nucleus.</bp:comment>
  <bp:xref rdf:resource="#PublicationXref13720"/>
  <bp:xref rdf:resource="#PublicationXref7487"/>
  <bp:xref rdf:resource="#RelationshipXref749"/>
  <bp:dataSource rdf:resource="#Provenance1"/>
  <bp:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Authored: Jakobi, R, 2008-02-05 11:04:14</bp:comment>
  <bp:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Reviewed: Chang, E, 2008-05-21 00:05:41</bp:comment>
  <bp:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Edited: Matthews, L, 2008-02-03 20:50:13</bp:comment>
  <bp:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Edited: Matthews, L, 2008-06-12 00:23:53</bp:comment>
</bp:Pathway>

Here you can see that there are two biochemical reactions in the pathway, rdf:ID="BiochemicalReaction6535" and rdf:ID="BiochemicalReaction6536", and also two pathway steps, rdf:ID="PathwayStep8092" and rdf:ID="PathwayStep8091".

I use Prolog to query the RDF triples (I am not sure if you are familiar with Prolog; apologies if not!). For example, I have a predicate:

% controlled_reaction(?Controller, ?Controlled_Reaction, ?Control)
% Matches only resources typed exactly as bp:Control; subclasses such as
% bp:Catalysis are not found here unless subclass reasoning is applied.
controlled_reaction(Controller, Controlled_Reaction, Control) :-
    rdf(Control, 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 'http://www.biopax.org/release/biopax-level3.owl#Control'),
    rdf(Control, 'http://www.biopax.org/release/biopax-level3.owl#controller', Controller),
    rdf(Control, 'http://www.biopax.org/release/biopax-level3.owl#controlled', Controlled_Reaction).

And for catalyzed reactions:

% catalyzed_reaction(?Controller, ?Controlled_Reaction, ?Catalysis)
% Catalysis is the bp:Catalysis instance; Controller is the actual catalyst.
catalyzed_reaction(Controller, Controlled_Reaction, Catalysis) :-
    rdf(Catalysis, 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 'http://www.biopax.org/release/biopax-level3.owl#Catalysis'),
    rdf(Catalysis, 'http://www.biopax.org/release/biopax-level3.owl#controller', Controller),
    rdf(Catalysis, 'http://www.biopax.org/release/biopax-level3.owl#controlled', Controlled_Reaction).

For example, I can use this to find that for rdf:ID="Catalysis1" the controller is rdf:ID="Protein5" and the controlled reaction is rdf:ID="BiochemicalReaction2". The corresponding XML snippet is:

<bp:Catalysis rdf:ID="Catalysis1">
  <bp:controller rdf:resource="#Protein5"/>
  <bp:controlled rdf:resource="#BiochemicalReaction2"/>
  <bp:controlType rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ACTIVATION</bp:controlType>
  <bp:xref rdf:resource="#RelationshipXref1"/>
  <bp:xref rdf:resource="#RelationshipXref2"/>
  <bp:dataSource rdf:resource="#Provenance1"/>
</bp:Catalysis>

Now, if I run the same query for rdf:ID="BiochemicalReaction6536" as the controlled reaction, it returns no results.

It seems that the information about the catalyst for the reaction is contained in the pathway step.

<bp:PathwayStep rdf:ID="PathwayStep8092">
  <bp:stepProcess rdf:resource="#BiochemicalReaction6536"/>
  <bp:stepProcess rdf:resource="#Catalysis102"/>
</bp:PathwayStep>

This shows a link between rdf:ID="BiochemicalReaction6536" and rdf:ID="Catalysis102".

If I look up rdf:ID="Catalysis102", the corresponding XML snippet is:

<bp:Catalysis rdf:ID="Catalysis102">
  <bp:controller rdf:resource="#Complex343"/>
  <bp:controlled rdf:resource="#BiochemicalReaction271"/>
  <bp:controlType rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ACTIVATION</bp:controlType>
  <bp:xref rdf:resource="#RelationshipXref144"/>
  <bp:xref rdf:resource="#RelationshipXref145"/>
  <bp:dataSource rdf:resource="#Provenance1"/>
</bp:Catalysis>

If I look up rdf:ID="Complex343", I can see that this is indeed the "26S proteasome". But the information for rdf:ID="Catalysis102" states that the controlled reaction is rdf:ID="BiochemicalReaction271" and that the control type is "ACTIVATION".

So my question is: What is the relationship between RDF:ID="BiochemicalReaction271" and RDF:ID="BiochemicalReaction6536"?

Do I infer that rdf:ID="BiochemicalReaction6536" is a type of rdf:ID="BiochemicalReaction271" because of rdf:ID="PathwayStep8092"? That is, should they have the same properties, most significantly the control type (in this case "ACTIVATION")?

To state another way:

By using the pathway steps, the results match the diagram and seem to make sense, but I am uncertain whether I am capturing the information correctly. For example, the information for rdf:ID="Catalysis102" states that the controller is rdf:ID="Complex343", the controlled reaction is rdf:ID="BiochemicalReaction271", and the control type is "ACTIVATION". Because rdf:ID="PathwayStep8092" contains the step processes rdf:ID="BiochemicalReaction6536" and rdf:ID="Catalysis102", do I say that the reaction rdf:ID="BiochemicalReaction6536" is also controlled by rdf:ID="Complex343" with control type "ACTIVATION"? And will the control type always be 'inherited' in this manner?

Does the relationship between rdf:ID="BiochemicalReaction6536" and rdf:ID="BiochemicalReaction271" have something to do with the section of the Reactome pathway viewer website that says the reaction has been "Inferred from another species", or is this unrelated?

Sorry, this is quite long and I am not sure if it is much clearer, but hopefully it is!

Thanks once again for your time and links. Paxtools does indeed look useful, but I am trying to do this in Prolog for a number of reasons, so it is not completely suitable. Reading the documentation has helped me to understand the format, though, so thank you.

10.8 years ago
B. Arman Aksoy ★ 1.2k

Hi Sam,

Thanks for all the detailed explanation! I am not familiar with Prolog, but I was able to follow your argument using the RDF IDs.

It looks like there is some discrepancy between the internal Reactome model and the exported BioPAX model, probably due to a bug in their BioPAX exporter. You are right about #Catalysis102: it should have its controlled property set to #BiochemicalReaction6536. This information cannot be captured via PathwaySteps, which should only provide information about the order of reactions.

You can also see that this is a bug because the actual controlled reaction has nothing to do with the original pathway at all. I think you should report this case to the Reactome people; they are pretty responsive to this type of bug report.
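One way to see how widespread such mismatches are would be a small consistency check like the Prolog sketch below. It assumes the triples are available as plain rdf/3 facts, uses shortened hypothetical atom names instead of full URIs, and the predicate names suspicious_catalysis/3 and suspicious_count/1 are made up for this illustration.

```prolog
% Toy triples mirroring the snippets in this thread (shortened names).
rdf(pathwayStep8092, stepProcess, biochemicalReaction6536).
rdf(pathwayStep8092, stepProcess, catalysis102).
rdf(catalysis102, type, catalysis).
rdf(catalysis102, controller, complex343).
rdf(catalysis102, controlled, biochemicalReaction271).

% suspicious_catalysis(-Step, -Catalysis, -Controlled):
% the Catalysis is listed in Step, but the reaction it controls is not.
suspicious_catalysis(Step, Catalysis, Controlled) :-
    rdf(Step, stepProcess, Catalysis),
    rdf(Catalysis, type, catalysis),
    rdf(Catalysis, controlled, Controlled),
    \+ rdf(Step, stepProcess, Controlled).

% suspicious_count(-N): number of distinct Catalysis entries affected.
suspicious_count(N) :-
    findall(C, suspicious_catalysis(_, C, _), Cs),
    sort(Cs, Unique),          % sort/2 also removes duplicates
    length(Unique, N).
```

On the toy facts, `?- suspicious_catalysis(S, C, R).` returns pathwayStep8092, catalysis102 and biochemicalReaction271; running suspicious_count/1 over a full export would give a figure to include in a bug report.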

Nice catch!

10.8 years ago
sam.neaves ▴ 20

Thanks for following my steps. However, I don't think it is a one-off mistake or bug in the BioPAX, because most of the Catalysis entries are like this, not just that one!

I will try emailing someone on the Reactome team and hope for a reply. If I get one, or if my understanding improves, I'll post an update here.

Thanks again.


turns out this is a bug on the Reactome side -- I think they are going to fix it soon.
