Linking Ensembl gene IDs to GO terms?
10.5 years ago
user ▴ 950

Is there a table, downloadable via FTP or accessible programmatically, that links the Ensembl gene IDs for a given genome assembly (such as 'hg18' or 'mm9') to their GO terms, i.e. IDs of the form "GO:..."? Is there a UCSC table that does this? I did not see any such table in: http://hgdownload.cse.ucsc.edu/goldenPath/mm9/database/

ucsc gene-ontology go gene-ids ensembl • 14k views
10.5 years ago
seidel 11k

You can create such a table fairly easily using R and the biomaRt package. The code below retrieves the table from Ensembl, which you could export or write to disk, and then converts it to a list keyed by gene ID, which is a convenient R data structure:

library(biomaRt)
# select mart and data set
bm <- useMart("ensembl")
bm <- useDataset("mmusculus_gene_ensembl", mart=bm)

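# Get ensembl gene ids and GO terms
# (older Ensembl releases named the gene-name attribute 'external_gene_id';
#  current releases use 'external_gene_name')
EG2GO <- getBM(mart=bm, attributes=c('ensembl_gene_id','external_gene_name','go_id'))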

# examine result
head(EG2GO,15)

# Remove blank entries
EG2GO <- EG2GO[EG2GO$go_id != '',]

# convert from table format to list format
geneID2GO <- by(EG2GO$go_id,
                EG2GO$ensembl_gene_id,
                function(x) as.character(x))

# examine result
head(geneID2GO)

# terms can be accessed using gene ids in various ways
> geneID2GO$ENSMUSG00000098488
[1] "GO:0009395" "GO:0008152" "GO:0005829" "GO:0030659" "GO:0004620"
[6] "GO:0004623" "GO:0046872" "GO:0005515"
> geneID2GO[['ENSMUSG00000098488']]
[1] "GO:0009395" "GO:0008152" "GO:0005829" "GO:0030659" "GO:0004620"
[6] "GO:0004623" "GO:0046872" "GO:0005515"
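
The export step mentioned above, and the table-to-list conversion, can also be sketched with base R alone. This is only a minimal sketch under stated assumptions, not the code from the answer: the small `EG2GO` data frame is a hand-made stand-in for the `getBM()` result (it reuses three of the GO ids shown above, plus one hypothetical gene id with an empty GO field), `split()` is swapped in for `by()`, and the output goes to a temporary file rather than a real path.

```r
# Toy stand-in for the getBM() result (assumption: same column names as above).
# "ENSMUSG_HYPOTHETICAL" is a made-up id used only to show the blank-entry filter.
EG2GO <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000098488", "ENSMUSG00000098488",
                      "ENSMUSG00000098488", "ENSMUSG_HYPOTHETICAL"),
  go_id = c("GO:0009395", "GO:0008152", "GO:0005829", ""),
  stringsAsFactors = FALSE
)

# Remove rows that carry no GO annotation
EG2GO <- EG2GO[EG2GO$go_id != "", ]

# Write the table to disk as tab-delimited text
out_file <- tempfile(fileext = ".tsv")
write.table(EG2GO, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# split() returns a plain named list: one character vector of GO ids per gene
geneID2GO <- split(EG2GO$go_id, EG2GO$ensembl_gene_id)
geneID2GO[["ENSMUSG00000098488"]]
```

A plain list from `split()` behaves the same way under `$` and `[[ ]]` as the `by()` result, and some downstream tools may find it easier to handle than an object of class "by".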

Or, if you prefer, using pointy-clicky BioMart. See the help video here.


As I say every few months: the answer to almost every "how to convert ID X to ID Y" question is BioMart, or UCSC tables.


On BioMart, when you return a table with the GO accession number, each gene is only associated with a single GO term. Shouldn't there be many GO terms for most genes? Which one does BioMart choose?
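
One likely explanation, assuming the table has the same shape as the `getBM()` result in the answer above, is that it is in "long" format: each row holds a single gene/GO pair, so a gene with several GO terms appears on several rows rather than on one. A minimal sketch with a toy table (the ids `geneA` and `geneB` are hypothetical placeholders, and the gene/term pairings are invented for illustration) shows how to count the terms per gene and collapse them onto one line:

```r
# Toy long-format table: one row per gene/GO pair (placeholder gene ids)
EG2GO <- data.frame(
  ensembl_gene_id = c("geneA", "geneA", "geneA", "geneB"),
  go_id = c("GO:0008152", "GO:0005829", "GO:0005515", "GO:0046872"),
  stringsAsFactors = FALSE
)

# Number of GO terms (rows) per gene
terms_per_gene <- table(EG2GO$ensembl_gene_id)
print(terms_per_gene)

# Collapse all GO terms of a gene into one semicolon-separated string
go_collapsed <- tapply(EG2GO$go_id, EG2GO$ensembl_gene_id,
                       paste, collapse = ";")
print(go_collapsed)
```

If every gene really does show up only once in the downloaded table, it would be worth checking whether the result was de-duplicated or filtered on export.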

10.5 years ago
Chris Fields ★ 2.2k

The UCSC Table Browser should have this, though it may take a little digging to pull all the relevant information together (it's not exactly user-friendly unless you understand SQL). I typically go with BioMart myself, which may have the UCSC IDs as well.
