I'm using the biomaRt package for R to get all the GO terms associated with a particular gene ID (Entrez ID). The script is quite simple and similar to the tutorial example:
library(biomaRt)
# geneList: vector of Entrez gene IDs
ensembl = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
goids = getBM(attributes = c('entrezgene', 'go_id'), filters = 'entrezgene', values = geneList, mart = ensembl)
This gives me a list of GO terms for each gene in geneList.
To check these results, I queried the GO database using GOOSE. To do so, I translated each Entrez ID into a UniProt ID and ran a basic query (SQL Query by Uniprot ID).
As a result, the list obtained with biomaRt is usually longer than the one obtained with GOOSE (which queries the GO database directly). How is this possible? Which method should I trust more? I think GOOSE is more reliable, but why does biomaRt list more GO terms?
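To make the comparison concrete, here is a minimal sketch of how the two lists could be compared gene by gene in base R. The two data frames are toy stand-ins with arbitrary placeholder IDs, not real query output. The names biomart_go, goose_go and only_in_biomart are made up for illustration, and the sketch assumes the GOOSE results have already been mapped back to Entrez IDs.

```r
# Toy stand-ins for the two result tables (placeholder IDs, not real output).
biomart_go <- data.frame(
  entrezgene = c("1", "1", "1", "2", "2"),
  go_id      = c("GO:0000001", "GO:0000002", "GO:0000003", "GO:0000001", ""),
  stringsAsFactors = FALSE
)
goose_go <- data.frame(
  entrezgene = c("1", "1", "2"),
  go_id      = c("GO:0000001", "GO:0000002", "GO:0000001"),
  stringsAsFactors = FALSE
)

# Drop rows with an empty GO ID, if any are present, so they are not
# counted as extra terms.
biomart_go <- biomart_go[biomart_go$go_id != "", ]

# Group the GO IDs by gene for each source.
biomart_by_gene <- split(biomart_go$go_id, biomart_go$entrezgene)
goose_by_gene   <- split(goose_go$go_id, goose_go$entrezgene)

# For every gene, keep the GO IDs that appear only in the biomaRt-style table.
only_in_biomart <- lapply(names(biomart_by_gene), function(g) {
  setdiff(biomart_by_gene[[g]], goose_by_gene[[g]])
})
names(only_in_biomart) <- names(biomart_by_gene)

print(only_in_biomart)
```

Looking at which terms end up in only_in_biomart for a few genes should show whether the extra entries follow a pattern.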
Can you please give an example?