Say I'm interested in the molecular function of Ywhaz. On AmiGO, which seems to be the basic web-tool immediately accessible from the GO website, I get 3 associations about protein binding.
On Biomart I get 4 results, the extra one being a "transcription factor binding" (which has caused me some trouble).
I was wondering:
What are the differences between these two services?
In general, is there a canonical source I should consult, or should I just be a lot more careful about the source material when querying online services?
I use the Ensembl Genes 58 (SANGER UK) database, selecting mouse genes. My AmiGO query is mouse too. I have no rational way to pick between databases, so I use Sanger because I met a girl once who worked there. She was nice.
Which Biomart server and database did you query? When I query biomart.org using MGI as the database, I get exactly the same three molecular function terms as you get on AmiGO, which is just how it should be since I can see that AmiGO used MGI as the source of annotations.
There are frequently differences of this kind between the annotations for different species; I typically find that human has more annotations than mouse.
Thanks! Though I'm aware of the differences between mouse and human annotations in general, this is bound to be the source of the discrepancy, no? This response doesn't answer the question about the differences between AmiGO and Biomart (which, seemingly, might actually be about the Sanger DB and whatever AmiGO points at), though, so I'll wait to see if any other answers are forthcoming before ticking...
The EnsEMBL annotation for Ywhaz (ENSMUSG00000022285) has 3 transcripts annotated to GO:0008134 (transcription factor binding) with evidence code IEA. The MGI annotation does not contain this. As the EnsEMBL gene has been merged with a Vega manually-annotated gene, it is possible that the extra term came from there... No, that hunch was wrong; the Vega versions have fewer GO terms annotated, which suggests to me that the extra one is from the automatic gene build.
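If you want to repeat this kind of check yourself, here is a rough sketch of how the comparison could be done, assuming you have each source's annotations as tab-separated lines in GAF 2.x layout (symbol in column 3, GO ID in column 5, evidence code in column 7, aspect in column 9, assigned-by in column 15). The function names, the `_row` helper and the sample rows are made up for illustration; the object IDs and references are placeholders, and "Ensembl" as the assigned-by value is an assumption, not something taken from a real export.

```python
from collections import defaultdict

# 0-based positions of the GAF 2.x columns used in this sketch
COL_SYMBOL, COL_GO_ID, COL_EVIDENCE, COL_ASPECT, COL_ASSIGNED_BY = 2, 4, 6, 8, 14


def molecular_function_terms(gaf_lines, symbol):
    """Map each molecular-function GO ID for `symbol` to its (evidence, assigned_by) pairs."""
    terms = defaultdict(set)
    for line in gaf_lines:
        # Skip blank lines and GAF header/comment lines (they start with "!")
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15:
            continue  # not a complete annotation row
        # Aspect "F" marks molecular function annotations
        if cols[COL_SYMBOL] == symbol and cols[COL_ASPECT] == "F":
            terms[cols[COL_GO_ID]].add((cols[COL_EVIDENCE], cols[COL_ASSIGNED_BY]))
    return dict(terms)


def only_in_first(first, second):
    """Return the GO IDs (with their provenance) present in `first` but not in `second`."""
    return {go_id: first[go_id] for go_id in first.keys() - second.keys()}


def _row(obj_id, symbol, go_id, evidence, aspect, assigned_by):
    """Hypothetical helper: build one 15-column GAF-style row with placeholder fields."""
    cols = ["DB", obj_id, symbol, "", go_id, "REF:placeholder", evidence, "",
            aspect, "", "", "protein", "taxon:10090", "20100101", assigned_by]
    return "\t".join(cols)


# Invented sample rows, for illustration only (not real exports from either service)
mgi_like = [
    "!gaf-version: 2.0",
    _row("ID:placeholder", "Ywhaz", "GO:0005515", "IPI", "F", "MGI"),
]
ensembl_like = [
    _row("ID:placeholder", "Ywhaz", "GO:0005515", "IPI", "F", "MGI"),
    _row("ID:placeholder", "Ywhaz", "GO:0008134", "IEA", "F", "Ensembl"),
]

mgi_terms = molecular_function_terms(mgi_like, "Ywhaz")
ensembl_terms = molecular_function_terms(ensembl_like, "Ywhaz")
extra = only_in_first(ensembl_terms, mgi_terms)

for go_id, provenance in sorted(extra.items()):
    print(go_id, sorted(provenance))
```

Keeping the evidence code and the assigning group next to each GO ID is the point: a term that appears in only one source with evidence code IEA is an electronic annotation, so it probably came from an automated pipeline rather than from a curator.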
For mouse genes I would recommend using MGI - it is the model organism database for mouse and thus presumably has the best annotation of mouse genes.
I think Lars is right. I would choose the data source where the provenance of the annotation is most clearly indicated.