22 months ago
4galaxy77
I have annotated some genes using biomaRt, requesting the field go_id for each gene. This returns multiple go_ids per gene, e.g.:
ensembl_gene_id go_id
1: ENSG00000261657 GO:0006810
2: ENSG00000261657 GO:0005739
3: ENSG00000261657 GO:0005634
4: ENSG00000261657 GO:0016021
5: ENSG00000261657
6: ENSG00000144741 GO:0006810
7: ENSG00000144741 GO:0005739
8: ENSG00000144741 GO:0005634
9: ENSG00000144741 GO:0016021
10: ENSG00000144741 GO:1901962
11: ENSG00000144741 GO:0015805
12: ENSG00000144741 GO:0000095
13: ENSG00000144741 GO:0005743
14: ENSG00000144741
If I choose one of these go_ids, e.g. GO:0005743, and look up the GO term hierarchy, I can see the full hierarchy for that term.
I am most interested in getting the highest-level term, like cellular_component, for each go_id.
How can I do this in R?
There's nothing like "go_domain"; the only GO-related attributes I can find are:
[1] "go_id" "go_linkage_type" "goslim_goa_accession" "goslim_goa_description"
Try something like
searchAttributes(mart, pattern='domain')
(I don't have biomaRt at hand just now)
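For what it's worth, here is a minimal sketch of two ways this could be done. It assumes that the Ensembl gene mart exposes the GO domain under the attribute name namespace_1003 (that is what I believe searchAttributes should turn up, but please check against your own mart), and that the Bioconductor packages biomaRt, GO.db and AnnotationDbi are installed. The function names fetch_go_domains, go_domain and expand_ontology_code are made up for illustration.

```r
# Expand GO.db's two-letter ontology codes into the root term names.
expand_ontology_code <- function(code) {
  labels <- c(BP = "biological_process",
              MF = "molecular_function",
              CC = "cellular_component")
  unname(labels[code])
}

# Option 1 (assumption: the mart has an attribute called "namespace_1003"
# holding the GO domain). Needs biomaRt and a network connection.
fetch_go_domains <- function(gene_ids) {
  mart <- biomaRt::useEnsembl(biomart = "ensembl",
                              dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(attributes = c("ensembl_gene_id", "go_id", "namespace_1003"),
                 filters    = "ensembl_gene_id",
                 values     = gene_ids,
                 mart       = mart)
}

# Option 2: look the domain up locally from go_ids you already have.
# Needs GO.db and AnnotationDbi; empty or unknown go_ids come back as NA.
go_domain <- function(go_ids) {
  ids <- unique(go_ids[!is.na(go_ids) & nzchar(go_ids)])
  res <- AnnotationDbi::select(GO.db::GO.db,
                               keys    = ids,
                               columns = "ONTOLOGY",
                               keytype = "GOID")
  domain <- expand_ontology_code(res$ONTOLOGY)
  names(domain) <- res$GOID
  unname(domain[go_ids])
}
```

With a table like the one in the question, something along the lines of `annot$domain <- go_domain(annot$go_id)` should add the top-level term per row. Note that go_ids which have since been made obsolete in GO may not be present in a recent GO.db and would then come back as NA.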