Hello,
How can I programmatically (Python/Java) find the common parent of two GO IDs (Gene Ontology)? And how can I calculate the distance between each GO ID and the common parent?
Thanks
If you would like an alternative method that doesn't require a relational database, here is a link to a Java library we are using within the GO Consortium:
http://code.google.com/p/owltools/
It wraps the OWL API and is intended to work over any OBO-format or OWL ontology.
Here is an example for finding the least common ancestor between any two nodes in the GO graph:
import owltools.graph.*;
import owltools.sim.*;
import owltools.io.*;

OWLGraphWrapper g = getOntologyWrapper("file:go.obo");
// negative regulation of type B pancreatic cell apoptosis
OWLObject a = g.getOWLObjectByIdentifier("GO:2000675");
// caspase activator activity
OWLObject b = g.getOWLObjectByIdentifier("GO:0008656");
SimEngine se = new SimEngine(g);
for (OWLObject c : se.getLeastCommonSubsumers(a,b)) {
    System.out.println(c);
    // there may be different paths with different distances between {a,b}
    // and c - just exhaustively list them all. if we like, we could
    // average them, find the max, etc
    for (OWLGraphEdge e : se.getEdgesBetween(a,c)) {
        System.out.println(" Edge:" +e +" distance: "+e.getDistance());
    }
    for (OWLGraphEdge e : se.getEdgesBetween(b,c)) {
        System.out.println(" Edge:" +e +" distance: "+e.getDistance());
    }
}
For more, see:
http://code.google.com/p/owltools/wiki/FindingCommonAncestors
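To illustrate the remark in the code comments about several paths with different distances, here is a minimal Python sketch that does not use owltools at all. The `PARENTS` dictionary is a made-up toy graph (not real GO terms) and `path_lengths` is a hypothetical helper; it simply lists the length of every upward path, after which one can take the minimum, maximum, or mean as preferred.

```python
from statistics import mean

# Hypothetical toy DAG: child -> list of direct parents (not real GO terms).
PARENTS = {
    "a": ["m", "c"],
    "b": ["c"],
    "m": ["c"],
    "c": ["root"],
    "root": [],
}

def path_lengths(start, target, parents):
    """Return the length of every upward path from start to target."""
    if start == target:
        return [0]
    lengths = []
    for parent in parents.get(start, []):
        # Each path through this parent is one edge longer.
        lengths.extend(1 + n for n in path_lengths(parent, target, parents))
    return lengths

lengths_a = sorted(path_lengths("a", "c", PARENTS))
lengths_b = sorted(path_lengths("b", "c", PARENTS))
print("a -> c:", lengths_a, "min", min(lengths_a), "max", max(lengths_a), "mean", mean(lengths_a))
print("b -> c:", lengths_b)
```

Plain recursion is fine for a toy graph; on the full ontology the number of paths can grow quickly, so caching intermediate results would probably be needed.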
Here is my Java solution:
import java.sql.*;
import java.util.*;

public class GoDistance
{
    private Connection connection;

    private GoDistance() {}

    /** returns a map (go-id/distance) */
    private Map<String,Integer> ancestors(String go) throws SQLException
    {
        Map<String,Integer> acn2distance=new HashMap<String,Integer>();
        //put self
        acn2distance.put(go,0);
        PreparedStatement stmt=null;
        ResultSet row=null;
        try
        {
            //search all the parents
            stmt=connection.prepareStatement(
                "SELECT DISTINCT "+
                " graph_path.distance,"+
                " ancestor.acc "+
                " FROM "+
                " term "+
                " INNER JOIN graph_path ON term.id=graph_path.term2_id "+
                " INNER JOIN term AS ancestor ON ancestor.id=graph_path.term1_id "+
                " WHERE term.acc=?"
                );
            stmt.setString(1,go);
            row=stmt.executeQuery();
            while(row.next())
            {
                int distance=row.getInt(1);
                String acn=row.getString(2);
                //keep the shortest distance seen for each ancestor
                Integer prev=acn2distance.get(acn);
                if(prev==null || prev>distance)
                {
                    acn2distance.put(acn,distance);
                }
            }
            return acn2distance;
        }
        finally
        {
            if(row!=null) row.close();
            if(stmt!=null) stmt.close();
        }
    }

    private void run(String go1,String go2) throws Exception
    {
        Class.forName("com.mysql.jdbc.Driver");
        connection=DriverManager.getConnection(
            "jdbc:mysql://mysql.ebi.ac.uk:4085/go_latest"+
            "?user=go_select&password=amigo"
            );
        //get all parents of go1
        Map<String,Integer> acn2dist1=ancestors(go1);
        //get all parents of go2
        Map<String,Integer> acn2dist2=ancestors(go2);
        connection.close();
        //common terms
        Set<String> acns=new HashSet<String>(acn2dist1.keySet());
        acns.retainAll(acn2dist2.keySet());
        if(acns.isEmpty()) return;
        //find the minimal distance to a common term
        Integer bestDist=null;
        String bestTerm=null;
        for(String acn:acns)
        {
            int d= acn2dist1.get(acn)+acn2dist2.get(acn);
            if(bestDist==null || bestDist>d)
            {
                bestDist=d;
                bestTerm=acn;
            }
        }
        if(bestDist==null) return;
        //print result: common term and summed distance
        System.out.println(bestTerm+"\t"+(acn2dist1.get(bestTerm)+acn2dist2.get(bestTerm)));
    }

    public static void main(String args[])
        throws Exception
    {
        if(args.length!=2) return;
        new GoDistance().run(args[0],args[1]);
    }
}
Compilation:
$ javac -Xlint GoDistance.java
Execution:
$ java -cp path/to/mysql-connector.jar:. GoDistance "GO:0001578" "GO:0030036"
GO:0007010 3
I wanted to query GO with Python, and this page helped me a lot! Thanks.
The MySQLdb module is needed; you can install it through aptitude (python-mysqldb).
import MySQLdb
go_db=MySQLdb.connect("mysql.ebi.ac.uk", port=4085, db="go_latest", user="go_select",passwd="amigo")
go_db_cursor=go_db.cursor()
query="..."
go_db_cursor.execute(query)
results=go_db_cursor.fetchall()
The MySQLdb user guide: http://mysql-python.sourceforge.net/MySQLdb.html
GO request howtos: http://wiki.geneontology.org/index.php/Example_Queries
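Once the ancestor rows for each term have been fetched, the common-parent step from the Java answer can be sketched in Python roughly as below. This is only an illustration: the rows are made-up `(distance, acc)` tuples with toy accessions (not real GO data), shaped the way rows from a query selecting the distance and the ancestor accession might look, and `nearest_distances` and `closest_common_ancestor` are hypothetical helper names.

```python
def nearest_distances(rows):
    """Collapse (distance, acc) rows to the smallest distance per accession."""
    best = {}
    for distance, acc in rows:
        if acc not in best or distance < best[acc]:
            best[acc] = distance
    return best

def closest_common_ancestor(rows1, rows2):
    """Return (acc, distance_from_term1, distance_from_term2) or None."""
    d1 = nearest_distances(rows1)
    d2 = nearest_distances(rows2)
    common = d1.keys() & d2.keys()
    if not common:
        return None
    # Smallest summed distance wins; the accession breaks ties deterministically.
    best = min(common, key=lambda acc: (d1[acc] + d2[acc], acc))
    return best, d1[best], d2[best]

# Made-up rows for two toy terms; each list includes the term itself at distance 0.
rows_x = [(0, "T:x"), (1, "T:p"), (2, "T:root"), (3, "T:root")]
rows_y = [(0, "T:y"), (2, "T:p"), (3, "T:root")]
print(closest_common_ancestor(rows_x, rows_y))
```

Unlike the Java version, which prints the summed distance, this returns the distance from each term to the common parent separately, which is what the original question asked for.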
Thanks Pierre
(extra points for anyone who can do the "least" part directly in SQL using some fiendish aggregate operators...)
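One possible way to push the "least" part into SQL is a self-join on graph_path with GROUP BY and ORDER BY ... LIMIT 1. The sketch below is untested against the real GO database: it uses Python's sqlite3 with an in-memory stand-in for the term and graph_path tables, toy accessions, and the same assumption as the Java query above (term1_id is the ancestor, term2_id the descendant).

```python
import sqlite3

# In-memory stand-in for the GO schema; toy data, not real GO terms.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term (id INTEGER PRIMARY KEY, acc TEXT);
CREATE TABLE graph_path (term1_id INTEGER, term2_id INTEGER, distance INTEGER);
INSERT INTO term VALUES (1,'T:root'),(2,'T:p'),(3,'T:x'),(4,'T:y');
-- assumed layout: term1_id = ancestor, term2_id = descendant
INSERT INTO graph_path VALUES
  (3,3,0),(2,3,1),(1,3,2),
  (4,4,0),(2,4,1),(1,4,2),
  (2,2,0),(1,2,1),(1,1,0);
""")

# Join the ancestors of both terms on the shared ancestor id,
# then keep the ancestor with the smallest summed distance.
QUERY = """
SELECT anc.acc, MIN(p1.distance) AS d1, MIN(p2.distance) AS d2
FROM term AS t1
JOIN graph_path AS p1 ON p1.term2_id = t1.id
JOIN graph_path AS p2 ON p2.term1_id = p1.term1_id
JOIN term AS t2 ON t2.id = p2.term2_id
JOIN term AS anc ON anc.id = p1.term1_id
WHERE t1.acc = ? AND t2.acc = ?
GROUP BY anc.acc
ORDER BY MIN(p1.distance) + MIN(p2.distance), anc.acc
LIMIT 1
"""

row = conn.execute(QUERY, ("T:x", "T:y")).fetchone()
print(row)
```

With MySQLdb the placeholders would be `%s` rather than `?`; whether this query performs acceptably on the full graph_path table is not something this sketch checks.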
For fans of Hibernate, there is a Hibernate API for the GO MySQL database, though it would probably be over the top for this example.
We also hope to have an alternative, more modern database schema available for querying soon (with PostgreSQL as the default DBMS). This also has its own Hibernate layer. An announcement will be sent to gofriends when we have something ready for testing.
@Chris, I 'solved' the MySQL problem a few years ago by creating a UDF: http://plindenbaum.blogspot.com/2010/02/mysql-user-defined-function-udf-for.html
Perfect. I was not sure about the query. Thank you so much.