How can I extract all the genes that are regulated by a specific gene of interest, i.e. all the genes that are direct downstream targets in every pathway that contains the gene of interest? My idea was to check known databases such as KEGG, Reactome, or BioGRID. I would appreciate your help and advice on this.
For me this is an extremely interesting and relevant question. What you want to do is precisely what we are working on for pathways from WikiPathways and, indeed, Reactome. Our idea was to first make sure we can actually describe the directed interactions in the pathways. There are different solutions for that, ranging from things as simple as directed arrows to MIM and SBGN interactions. We then export this as RDF and make it available from a SPARQL endpoint. We expect a big update for that, including, among other things, Reactome content and example SPARQL queries for that directionality, in the next two weeks or so.
Our own approach is to then load that into the Open PHACTS cache and implement new API calls. But you could of course just query the data directly with SPARQL. Note that in principle you should be able to use our KEGG converter to do the same thing for KEGG, but I am not sure whether that will work well enough.
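A minimal sketch of what querying the data directly could look like from Python, using only the standard library. The endpoint URL and the vocabulary terms (`wp:DirectedInteraction`, `wp:source`, `wp:target`, `dcterms:isPartOf`) are assumptions and should be checked against the example queries on the WikiPathways help page; the function names are made up for illustration. The result parsing follows the standard SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL; check the WikiPathways help pages for the current one.
ENDPOINT = "http://sparql.wikipathways.org/sparql"


def build_downstream_query(gene_label):
    """Build a query for the targets of directed interactions whose source
    node carries the given label. Vocabulary terms are assumptions."""
    # Escape backslashes and double quotes so the label stays a valid literal.
    safe = gene_label.replace("\\", "\\\\").replace('"', '\\"')
    return f"""PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?pathway ?targetLabel WHERE {{
  ?interaction a wp:DirectedInteraction ;
               wp:source ?source ;
               wp:target ?target ;
               dcterms:isPartOf ?pathway .
  ?source rdfs:label ?sourceLabel .
  ?target rdfs:label ?targetLabel .
  FILTER (str(?sourceLabel) = "{safe}")
}}"""


def run_query(query, endpoint=ENDPOINT):
    """Send the query over HTTP GET and return the parsed JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def parse_targets(results):
    """Turn SPARQL JSON results into (pathway URI, target label) pairs."""
    return [
        (row["pathway"]["value"], row["targetLabel"]["value"])
        for row in results["results"]["bindings"]
    ]
```

Something like `parse_targets(run_query(build_downstream_query("TP53")))` would then list the targets per pathway, provided the endpoint is reachable and the assumed terms match the loaded RDF.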
There is, however, another, complementary approach in which you use database knowledge about protein-protein interactions (e.g. from IntAct, which you could consider downstream), transcription factor regulation (e.g. from TFe, normally upstream unless your gene itself encodes a transcription factor), miRNA target data (from various resources), and all kinds of regulation from ENCODE. We have a lot of that kind of information available for use in Cytoscape through the RegINs (regulatory interaction networks) for CyTargetLinker. These RegINs basically contain what you need, but you might need some new analytical approaches to make use of the directionality.
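As a rough illustration of using the directionality, here is a sketch that treats regulatory interactions as directed edges and looks up the direct downstream targets of one gene. It does not use the RegIN file format or the CyTargetLinker API; the edge tuples, interaction type names, and gene names are placeholders, not real biological data.

```python
from collections import defaultdict


def build_directed_network(edges):
    """Index directed edges (regulator, target, interaction_type) by regulator."""
    network = defaultdict(set)
    for regulator, target, interaction_type in edges:
        network[regulator].add((target, interaction_type))
    return network


def direct_targets(network, gene, interaction_types=None):
    """Return the sorted direct downstream targets of a gene,
    optionally restricted to a set of interaction types."""
    hits = network.get(gene, set())
    return sorted(
        {
            target
            for target, kind in hits
            if interaction_types is None or kind in interaction_types
        }
    )


# Toy edges for illustration only.
edges = [
    ("GENE_A", "GENE_B", "TF-target"),
    ("GENE_A", "GENE_C", "PPI"),
    ("MIR_X", "GENE_A", "miRNA-target"),
    ("GENE_B", "GENE_D", "TF-target"),
]
network = build_directed_network(edges)
print(direct_targets(network, "GENE_A"))                 # ['GENE_B', 'GENE_C']
print(direct_targets(network, "GENE_A", {"TF-target"}))  # ['GENE_B']
```

Only edges that start at the gene of interest are followed, so upstream regulators (here `MIR_X`) and indirect targets (here `GENE_D`) are left out.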
Thanks, Chris, for your answer. I'll try to write a SPARQL query for my problem and will post it here when it works ;)
@Chris: I tried a simple query to test SPARQL: http://www.wikipathways.org/index.php/Help:WikiPathways_Sparql_queries#Get_all_pathways_with_a_particular_gene but it gives me an error:
Is that normal? Maybe an upgrade of the server is ongoing now?
The problem was that the prefix definitions were lost during an update of the triple store. They have now been restored, so the queries should work again.
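One way to make queries independent of the prefixes defined on the server is to declare them in the query itself. A small sketch of a helper that prepends any missing `PREFIX` lines; the helper name is made up, and the namespace URIs in the table (in particular the one for `wp`) are assumptions to verify against the WikiPathways RDF documentation.

```python
import re

# Assumed namespace URIs; verify against the WikiPathways RDF documentation.
PREFIXES = {
    "wp": "http://vocabularies.wikipathways.org/wp#",
    "dcterms": "http://purl.org/dc/terms/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def with_explicit_prefixes(query, prefixes=PREFIXES):
    """Prepend a PREFIX line for every known prefix that the query uses
    but does not declare itself."""
    missing = []
    for name, uri in prefixes.items():
        declared = re.search(rf"(?i)\bPREFIX\s+{re.escape(name)}:", query)
        # A use looks like "wp:Pathway": prefix, colon, then a name character.
        used = re.search(rf"(?<![\w<]){re.escape(name)}:\w", query)
        if used and not declared:
            missing.append(f"PREFIX {name}: <{uri}>")
    return "\n".join(missing + [query]) if missing else query


print(with_explicit_prefixes("SELECT ?p WHERE { ?p a wp:Pathway }"))
```

Queries that already declare their prefixes pass through unchanged, so the helper can be applied to every query before it is sent.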
New WikiPathways RDF and Reactome RDF were just loaded (Reactome not tested yet).
Some more complex queries run into memory issues, which we hope to solve with a triple store update in a few weeks.