I'm new to bioinformatics (I have a computer science background) and I'm trying to define a measure of similarity between a pair of genes based on GO terms. I'd like to test it across several different genomic datasets. The problem is that each dataset uses a different type of gene ID (e.g., in an ovarian cancer dataset the IDs look like MZ7.5306442, while in a prostate cancer dataset they look like 1020sat). Is there a tool or website that can convert these different gene IDs to the corresponding GO IDs?
Agreed. In this case, though, you might want to convert them all to the ID type that is most useful for queries against GO. Maybe UniProt?
Good point, Obi. It just occurred to me that if the data were coordinate-based, a tool such as GREAT could be used. GREAT is very good at GO-type questions for human, mouse and zebrafish. Working with genome coordinates cuts through the whole nomenclature problem.
I have several datasets to analyze. Once I have the UniProt IDs, which tool could give me the GO terms for each gene? I saw there are a lot of tools, but I need a simple one that queries GO and returns only the GO terms.
And working with coordinates introduces a genome build issue, perhaps.
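On the question of getting only GO terms once you have UniProt IDs: one simple option that needs no web tool is to read a GO annotation file in GAF format yourself. This is only a minimal sketch, assuming a tab-separated GAF 2.x file in which column 2 is the database object ID (the UniProt accession in a UniProt-GOA file), column 4 is the qualifier and column 5 is the GO ID. The function name `go_terms_from_gaf`, the file name in the comment and the sample rows are made up for illustration and are not real annotations.

```python
from collections import defaultdict


def go_terms_from_gaf(lines, wanted_ids=None):
    """Map object IDs (column 2 of a GAF file) to the set of GO IDs (column 5).

    `lines` is any iterable of text lines, e.g. an open file handle.
    `wanted_ids` optionally restricts the result to those IDs.
    """
    wanted = set(wanted_ids) if wanted_ids is not None else None
    terms = defaultdict(set)
    for line in lines:
        # Skip blank lines and GAF header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # malformed row
        obj_id, qualifier, go_id = cols[1], cols[3], cols[4]
        # A "NOT" qualifier means the gene product is NOT annotated to the term.
        if "NOT" in qualifier.split("|"):
            continue
        if wanted is not None and obj_id not in wanted:
            continue
        terms[obj_id].add(go_id)
    return dict(terms)


# Illustrative rows only (placeholder accessions and GO IDs, truncated columns).
sample = [
    "!gaf-version: 2.2",
    "UniProtKB\tACC0001\tGENE1\tinvolved_in\tGO:0000001",
    "UniProtKB\tACC0001\tGENE1\tenables\tGO:0000002",
    "UniProtKB\tACC0002\tGENE2\tNOT|involved_in\tGO:0000003",
    "UniProtKB\tACC0003\tGENE3\tlocated_in\tGO:0000004",
]

if __name__ == "__main__":
    for acc, gos in go_terms_from_gaf(sample, wanted_ids=["ACC0001", "ACC0002"]).items():
        print(acc, sorted(gos))

# With a real, gzip-compressed file (the file name here is an assumption):
#   import gzip
#   with gzip.open("goa_human.gaf.gz", "rt") as fh:
#       mapping = go_terms_from_gaf(fh, wanted_ids=my_uniprot_ids)
```

The result is a plain dictionary of sets, which should be convenient for a set-based similarity measure. Whether to drop "NOT" annotations or filter by evidence code is a choice you would need to make for your own measure.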