I was wondering if anybody knows of a way to annotate protein sequences with very general terms (similar to GO Slim terms). I am looking for something along the lines of the ChEMBL classification, but perhaps more general, as I want to classify both yeast and human proteins.
I would suggest looking at the COG functional categories, a set of broad categories into which you can classify proteins. If you are planning to compare yeast and human proteins in terms of function, the eukaryotic orthologous groups from the eggNOG database may be of use to you. They cover both yeast and human proteins, and they are annotated with the COG functional categories.
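To make the idea concrete, here is a minimal Python sketch that expands COG functional category letters into their broad descriptions. The letter codes are the standard COG ones; the protein names and their assignments in `example_assignments` are made up for illustration, and in practice the letters would come from whatever orthologous-group annotation you download.

```python
# Standard one-letter COG functional categories.
COG_CATEGORIES = {
    "J": "Translation, ribosomal structure and biogenesis",
    "A": "RNA processing and modification",
    "K": "Transcription",
    "L": "Replication, recombination and repair",
    "B": "Chromatin structure and dynamics",
    "D": "Cell cycle control, cell division, chromosome partitioning",
    "Y": "Nuclear structure",
    "V": "Defense mechanisms",
    "T": "Signal transduction mechanisms",
    "M": "Cell wall/membrane/envelope biogenesis",
    "N": "Cell motility",
    "Z": "Cytoskeleton",
    "W": "Extracellular structures",
    "U": "Intracellular trafficking, secretion, and vesicular transport",
    "O": "Posttranslational modification, protein turnover, chaperones",
    "C": "Energy production and conversion",
    "G": "Carbohydrate transport and metabolism",
    "E": "Amino acid transport and metabolism",
    "F": "Nucleotide transport and metabolism",
    "H": "Coenzyme transport and metabolism",
    "I": "Lipid transport and metabolism",
    "P": "Inorganic ion transport and metabolism",
    "Q": "Secondary metabolites biosynthesis, transport and catabolism",
    "R": "General function prediction only",
    "S": "Function unknown",
}


def describe_categories(letters):
    """Expand a string of category letters (e.g. "KL") into descriptions."""
    return [COG_CATEGORIES.get(letter, "Unknown category") for letter in letters]


def annotate(assignments):
    """Map each protein ID to the descriptions of its category letters."""
    return {pid: describe_categories(letters) for pid, letters in assignments.items()}


# Hypothetical assignments, for illustration only (not real annotations).
example_assignments = {"yeast_protein_1": "K", "human_protein_1": "OT"}
annotated = annotate(example_assignments)
```

A group can carry more than one letter, which is why each protein maps to a list rather than a single category.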
Also, if you have only the sequence and you presume the protein is not in UniProt (or UniProt has no information on it), you can use the InterPro server at http://www.ebi.ac.uk/interpro/. It will try to assign functional categories (GO terms) from the sequence information in an "intelligent" manner: InterPro tries to find the families the protein belongs to, searches for specific patterns, and so on.
InterPro is manually curated and contains 11 different databases (prediction methods), which are specifically tuned to give you only high-precision results. So it is unlikely that you will get a badly wrong functional assignment for the protein, although sometimes you will get no output at all.
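If you run the scan in bulk and end up with tab-separated results, collecting the GO terms per protein could look roughly like the sketch below. It assumes the protein ID is in the first column and the GO terms in the 14th column, which is my understanding of the InterProScan TSV layout when GO annotation is requested; check this against your own output. The two sample rows are invented for illustration.

```python
import re
from collections import defaultdict

# GO identifiers look like "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def go_terms_per_protein(tsv_lines):
    """Collect the set of GO IDs seen for each protein in TSV result lines.

    Assumption: column 1 is the protein ID and column 14 holds GO terms.
    Proteins with no GO terms are kept, with an empty list.
    """
    terms = defaultdict(set)
    for line in tsv_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        go_field = fields[13] if len(fields) > 13 else ""
        terms[fields[0]].update(GO_ID.findall(go_field))
    return {pid: sorted(ids) for pid, ids in terms.items()}


# Invented sample rows (not real scan output), one with and one without GO terms.
sample_rows = [
    "\t".join([
        "query1", "md5-placeholder", "300", "Pfam", "PF00069",
        "Protein kinase domain", "10", "250", "1e-50", "T", "01-01-2020",
        "IPR000719", "Protein kinase domain", "GO:0004672|GO:0005524",
    ]),
    "\t".join([
        "query2", "md5-placeholder", "120", "Pfam", "PF00000",
        "Hypothetical signature", "5", "90", "1e-5", "T", "01-01-2020",
    ]),
]

go_by_protein = go_terms_per_protein(sample_rows)
```

Matching GO IDs with a regular expression rather than splitting on a separator keeps the sketch tolerant of small differences in how the GO column is written. The resulting GO terms are still fine-grained, so you would map them onto a slim afterwards to get broad categories.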
"[...] a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces [...]".
I think you missed the point of the question. Iain is asking how to classify yeast and human proteins according to a set of very broad, general functional terms. I do not see how UniProt IDs would help him.
Thanks Lars, I think that is closest to what I am looking for.