In Ensembl, database entries are linked to GO terms at the transcript level. According to the Perl API documentation, the GO terms listed are those associated with the Swiss-Prot entry for the transcript. These GO terms are then linked directly to the transcript as xrefs via the appropriate tables in the database (object_xref and ontology_xref). However, as far as I can see, there is no way, using the database directly, to determine the category of a GO term (biological process, cellular component or molecular function). There do not seem to be any fields in the database schema that capture this information.
Please can you advise whether there is a method of determining the GO term category?
EDIT:
In response to the second answer: this query is in the Perl API code for retrieving ontology terms:
SELECT DISTINCT
term.term_id,
term.accession,
term.name,
term.definition,
term.subsets,
ontology.namespace
FROM ontology
JOIN term USING (ontology_id)
JOIN synonym USING (term_id)
WHERE ( term.name LIKE ? OR synonym.name LIKE ? );
But where are these tables (term, ontology and synonym)? They are not in the main databases, and they are not in the ontology_mart database on the BioMart server either.
If you're using the API, you need to set up an ontology adaptor, e.g.:
my $goadaptor = $registry->get_adaptor('Multi', 'Ontology', 'GOTerm');
You then need to pull the DB links for your transcript object, i.e.:
my $trandblinks = $transcript->get_all_DBLinks;
my @trandblinks = @{$trandblinks};
while (my $trandblink = shift @trandblinks) {
    my $tranprimid = $trandblink->primary_id;
    my $trandispid = $trandblink->display_id;
    my $trandb = $trandblink->dbname;
Then use the ontology adaptor to get the required details...
    if ($trandb eq "GO") {
        my $term = $goadaptor->fetch_by_accession($trandispid);
        # Guard against accessions the adaptor cannot find
        if ($term && $term->namespace) {
            my $process = $term->namespace;
        }
    }
}
BioMart has been updated for v62. I would like to point out the new 'GO domain' attribute, which tells you which of the three categories a GO term falls into (i.e. biological process, cellular component or molecular function).
Thanks for your reply. Do you know where this information is stored in the underlying database? I prefer to work with the db directly if possible.
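For working with the db directly, here is a minimal sketch of the lookup implied by the query quoted in the question: join term to ontology on ontology_id and read ontology.namespace for a given accession. It assumes only the columns that appear in that query. The in-memory SQLite database and its three rows are an illustrative stand-in (not a dump of any Ensembl database), and go_namespace is a hypothetical helper name; against the real server you would swap in your own MySQL connection to whichever database holds these tables.

```python
import sqlite3

# In-memory SQLite stand-in for whichever database holds the `term` and
# `ontology` tables. Only columns seen in the quoted query are modelled,
# and the rows are illustrative examples (the three GO root terms).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ontology (ontology_id INTEGER PRIMARY KEY, namespace TEXT);
    CREATE TABLE term (term_id INTEGER PRIMARY KEY, ontology_id INTEGER,
                       accession TEXT, name TEXT);
    INSERT INTO ontology VALUES (1, 'biological_process'),
                                (2, 'molecular_function'),
                                (3, 'cellular_component');
    INSERT INTO term VALUES (1, 1, 'GO:0008150', 'biological_process'),
                            (2, 2, 'GO:0003674', 'molecular_function'),
                            (3, 3, 'GO:0005575', 'cellular_component');
""")


def go_namespace(conn, accession):
    """Return the ontology namespace for a GO accession, or None if absent."""
    row = conn.execute(
        "SELECT ontology.namespace FROM ontology "
        "JOIN term USING (ontology_id) WHERE term.accession = ?",
        (accession,),
    ).fetchone()
    return row[0] if row else None


# The accession is the same value the API answer passes to fetch_by_accession.
print(go_namespace(conn, "GO:0008150"))
```

The same SELECT should carry over to MySQL, although the placeholder syntax depends on the driver you use.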