How to obtain the highest-level GO term from a go_id?
22 months ago
4galaxy77 2.9k

I have annotated a set of genes using biomaRt, requesting the go_id attribute for each gene. This returns multiple go_ids per gene, e.g.:

   ensembl_gene_id      go_id
 1: ENSG00000261657 GO:0006810
 2: ENSG00000261657 GO:0005739
 3: ENSG00000261657 GO:0005634
 4: ENSG00000261657 GO:0016021
 5: ENSG00000261657
 6: ENSG00000144741 GO:0006810
 7: ENSG00000144741 GO:0005739
 8: ENSG00000144741 GO:0005634
 9: ENSG00000144741 GO:0016021
10: ENSG00000144741 GO:1901962
11: ENSG00000144741 GO:0015805
12: ENSG00000144741 GO:0000095
13: ENSG00000144741 GO:0005743
14: ENSG00000144741

If I choose one of these go_ids, e.g. GO:0005743, and look up its GO term hierarchy, I get the following:

[Image: GO term hierarchy for GO:0005743]

I am most interested in getting the highest-level term, such as cellular_component, for each go_id.

How can I do this in R?

GO-term R • 898 views
22 months ago

biomaRt should have that information already. Have a look at the output of listAttributes(mart) and search for something like "go_domain".
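
As a rough sketch of this suggestion: in the Ensembl gene marts the GO domain appears to be exposed under the attribute name namespace_1003 (described as "GO domain") rather than anything containing "go". That name is an assumption here and should be confirmed against listAttributes(mart) for your mart and release. The gene IDs are the two from the question.

```r
library(biomaRt)

# Connect to the Ensembl human gene mart (requires network access).
mart <- useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")

genes <- c("ENSG00000261657", "ENSG00000144741")

# ASSUMPTION: namespace_1003 is the "GO domain" attribute in this mart.
# Check listAttributes(mart) before relying on it.
res <- getBM(
  attributes = c("ensembl_gene_id", "go_id", "namespace_1003"),
  filters    = "ensembl_gene_id",
  values     = genes,
  mart       = mart
)

# Drop rows without a GO annotation (returned as empty strings).
res <- res[res$go_id != "", ]
head(res)
```

If the attribute is present, each go_id row should carry its domain (biological_process, molecular_function or cellular_component) in the third column.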


There's nothing like "go_domain". The GO-related attributes I can find are: [1] "go_id" "go_linkage_type" "goslim_goa_accession" "goslim_goa_description"
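
If the mart really does not offer a domain attribute, one alternative that does not depend on biomaRt is the GO.db annotation package from Bioconductor, which stores the ontology (BP, MF or CC) for every GO ID. The following is a minimal sketch of that different approach, not what was suggested above. The root_names lookup vector is a helper introduced here for illustration, and the three GO IDs are taken from the question.

```r
library(GO.db)

go_ids <- c("GO:0006810", "GO:0005739", "GO:0005743")

# Look up the term name and ontology code for each GO ID.
onto <- AnnotationDbi::select(
  GO.db,
  keys    = go_ids,
  columns = c("TERM", "ONTOLOGY"),
  keytype = "GOID"
)

# Hypothetical helper: map GO.db's two-letter codes to the root term names.
root_names <- c(
  BP = "biological_process",
  MF = "molecular_function",
  CC = "cellular_component"
)
onto$domain <- unname(root_names[onto$ONTOLOGY])
onto
```

GO IDs that are obsolete in the installed GO.db version come back with NA in the TERM and ONTOLOGY columns, so it is worth checking for NA before merging the result back onto the gene table.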


Try something like searchAttributes(mart, pattern='domain') (I don't have biomaRt at hand just now)
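
For reference, a short sketch of that search, assuming the Ensembl human gene mart. searchAttributes() matches the pattern against the attribute descriptions as well as the names, which helps when the internal name of an attribute does not resemble its description.

```r
library(biomaRt)

# Connect to the Ensembl human gene mart (requires network access).
mart <- useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")

# Returns a data frame of matching attributes (name, description, page).
# The match is case-sensitive, so "domain" and "Domain" can give different hits.
hits <- searchAttributes(mart = mart, pattern = "domain")
hits
```

The result will probably also include protein-domain attributes, so look for the row whose description mentions GO and pass its name to getBM().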
