Returning to a project after some time, I noticed that a simple biomaRt script I had written to return all HGNC gene symbols associated with a list of GO terms often gives gene lists that are inconsistent with the list obtained by searching for the same term directly in the AmiGO 2 browser.
A representative example of my code:
library(biomaRt)

## Provide a single GO term as a toy example
go_term <- c("GO:0005543")

## Query genes associated with the GO term
ensembl <- useMart("ensembl")
ensembl <- useDataset("hsapiens_gene_ensembl", mart = ensembl)
geneList <- getBM(attributes = "hgnc_symbol",
                  filters = c("go"),
                  values = go_term,
                  mart = ensembl)
I would like to better understand the reason for this inconsistency and to know whether I should revisit this aspect of my project.
Most likely you're now using a different version of Ensembl than the one you were using the first time.
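If a change of release is the suspect, one way to check is to pin the query to an archived Ensembl release and compare it with the current one. This is a minimal sketch, not a definitive recipe: `version = 95` is a placeholder (the release used originally is not stated here), `genes_for_go` is a hypothetical helper name, and filter names can differ between releases, so `"go"` may need adjusting for older archives.

```r
library(biomaRt)

## List the archived Ensembl releases that biomaRt can connect to
listEnsemblArchives()

## Connect to the current release and to a pinned release.
## NOTE: version = 95 is a placeholder, not the release used originally.
current <- useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
pinned  <- useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl",
                      version = 95)

go_term <- "GO:0005543"

## Hypothetical helper: run the same GO query against a given mart
genes_for_go <- function(mart, go_id) {
  res <- getBM(attributes = "hgnc_symbol",
               filters = "go",
               values = go_id,
               mart = mart)
  ## Drop empty symbols and duplicates
  unique(res$hgnc_symbol[res$hgnc_symbol != ""])
}

current_genes <- genes_for_go(current, go_term)
pinned_genes  <- genes_for_go(pinned, go_term)

## Genes present in one release but not in the other
setdiff(current_genes, pinned_genes)
setdiff(pinned_genes, current_genes)
```

If the two `setdiff` results are small, a release change alone is unlikely to explain a large discrepancy.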
That is possible. But why, right now, can I search for this GO ID in the AmiGO 2 browser and find 375 associated gene products, while running the same search with the code above yields 85 genes? I suspect I don't understand the role of the Ensembl mart object here.
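On the role of the mart object: it only fixes which database and dataset are queried. Which genes come back depends on the filter. One possible explanation for the gap, offered as an assumption rather than something confirmed in this thread, is that the `go` filter matches only genes annotated directly with the term, while AmiGO 2 also counts annotations to its descendant terms and, unless restricted by organism, gene products from species other than human. A minimal sketch for checking this, assuming the current human dataset exposes a `go_parent_term` filter (the `searchFilters` call shows whether it does):

```r
library(biomaRt)

ensembl <- useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")

## Show the GO-related filters this dataset actually offers
searchFilters(mart = ensembl, pattern = "go")

go_term <- "GO:0005543"

## Genes annotated directly with the term
direct <- getBM(attributes = "hgnc_symbol",
                filters = "go",
                values = go_term,
                mart = ensembl)

## Genes annotated with the term or any of its child terms
## (assumes go_parent_term appears in the filter list above)
with_children <- getBM(attributes = "hgnc_symbol",
                       filters = "go_parent_term",
                       values = go_term,
                       mart = ensembl)

## Compare the number of distinct symbols returned by each filter
length(unique(direct$hgnc_symbol))
length(unique(with_children$hgnc_symbol))
```

If the second count is much larger than the first, the difference from AmiGO 2 is more likely a matter of how child terms are handled than of the Ensembl release.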