Inconsistencies in Using biomaRt to Retrieve HGNC Names from GO Terms
7.2 years ago
JMallory • 0

Returning to a project after some time, I noticed that a simple biomaRt script I had written to return all HGNC gene symbols associated with a list of GO terms often produces gene lists that are inconsistent with those obtained by searching for the same term directly in the AmiGO 2 browser.

A representative example of my code:

library(biomaRt)

## provide a single GO term as a toy example
go_term <- c("GO:0005543")

## connect to the Ensembl mart and select the human gene dataset
ensembl <- useMart("ensembl")
ensembl <- useDataset("hsapiens_gene_ensembl", mart = ensembl)

## query genes annotated with the GO term
geneList <- getBM(attributes = "hgnc_symbol",
                  filters = "go",
                  values = go_term,
                  mart = ensembl)

I would like to better understand the reason for this inconsistency and know if I should revisit this aspect of my project.


Most likely you're now using a different version of Ensembl than the one you were using the first time.
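
One way to rule out a version difference is to pin the Ensembl release explicitly rather than relying on whatever the current release happens to be. The sketch below assumes a reasonably recent biomaRt; the release number is a hypothetical placeholder and should be replaced with one that `listEnsemblArchives()` actually lists.

```r
library(biomaRt)

# Show which archived Ensembl releases are currently reachable
print(listEnsemblArchives())

# Hypothetical placeholder: substitute a release shown in the table above
release <- 95

# Connect to that specific release instead of the current one
ensembl_pinned <- useEnsembl(biomart = "ensembl",
                             dataset = "hsapiens_gene_ensembl",
                             version = release)
```

Running the same `getBM()` call against `ensembl_pinned` and against the current release would show whether the annotation changed between releases.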


That is possible. But why, right now, does searching for this GO ID in the AmiGO 2 browser return 375 associated gene products, while the same search scripted with the code above yields 85 genes? I suspect I don't understand the role of the Ensembl mart object here.

7.2 years ago
Ben Moore ★ 2.4k

Hi JMallory,

The biomaRt query you have used here retrieves only the genes annotated directly with that GO term.

The AmiGO 2 browser returns all genes associated with that GO term AND its daughter terms.

You can retrieve the genes associated with the GO term AND its daughter terms from Ensembl BioMart using the "go_parent_term" filter:

<Dataset name = "hsapiens_gene_ensembl" interface = "default" >
    <Filter name = "go_parent_term" value = "GO:0005543"/>
    <Attribute name = "ensembl_gene_id" />
    <Attribute name = "ensembl_transcript_id" />
</Dataset>

In the web interface, this filter is found in the 'GENE ONTOLOGY' filter sub-menu. Your original query is equivalent to entering the GO term in the 'Input External References ID list' filter in the 'GENE' filter sub-menu.
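
The same filters can also be looked up from R instead of the web interface. This is a minimal sketch, assuming the current Ensembl release still exposes the GO filters under names beginning with "go"; the variable names are illustrative.

```r
library(biomaRt)

# Connect to the human gene dataset
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# listFilters() returns a data frame with 'name' and 'description' columns
all_filters <- listFilters(ensembl)

# Keep only the filters whose name starts with "go"
go_filters <- all_filters[grepl("^go", all_filters$name), ]
print(go_filters)
```

The descriptions in this table should make it clear which filter matches a term exactly and which one also covers its daughter terms.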

Best wishes

Ben, Ensembl Helpdesk


For ease of copy/paste, the R code you would use is:

geneList <- getBM(attributes = "hgnc_symbol",
                  filters = "go_parent_term", 
                  values = go_term, 
                  mart = ensembl)
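
To check how much the two filters differ for a given term, both queries can be run side by side. This is only a sketch: `fetch_symbols` is a hypothetical helper introduced for illustration, and the counts will depend on the Ensembl release being queried.

```r
library(biomaRt)

go_term <- "GO:0005543"
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# Hypothetical helper: unique, non-empty HGNC symbols for one filter
fetch_symbols <- function(filter_name) {
  res <- getBM(attributes = "hgnc_symbol",
               filters = filter_name,
               values = go_term,
               mart = ensembl)
  symbols <- as.character(res$hgnc_symbol)
  unique(symbols[!is.na(symbols) & symbols != ""])
}

direct   <- fetch_symbols("go")              # term itself only
expanded <- fetch_symbols("go_parent_term")  # term plus daughter terms

# Compare the sizes of the two gene lists
length(direct)
length(expanded)

# Genes that appear only once daughter terms are included
head(setdiff(expanded, direct))
```

Exact agreement with AmiGO 2 should not be expected even with the parent-term filter, since AmiGO 2 counts gene products while this query returns HGNC symbols.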