We have a dataset consisting of 580-some proteins and are trying to condense it down by functional group. As such we want to go from the existing Protein ID (e.g. P02647) to an EC number. I can't seem to find a reliable way of doing this, let alone doing it batch-wise. Any suggestions would be appreciated!
There might be a better solution, but you can download the ExPASy ENZYME database here: ftp://ftp.expasy.org/databases/enzyme. Parsing it will probably get you what you want.
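A minimal sketch of that parsing step, assuming the flat file in that directory (`enzyme.dat`) uses the usual layout: an `ID` line carrying the EC number, `DR` lines listing `accession, entry_name;` pairs, and `//` closing each entry. The function names and the local file path are placeholders, not anything from the ENZYME distribution itself.

```python
from collections import defaultdict


def parse_enzyme_dat(lines):
    """Build a mapping of UniProt accession -> set of EC numbers.

    Assumes the ENZYME flat-file layout: 'ID' lines hold the EC number,
    'DR' lines hold 'ACCESSION, ENTRY_NAME;' pairs, '//' ends an entry.
    """
    acc_to_ec = defaultdict(set)
    current_ec = None
    for line in lines:
        if line.startswith("ID   "):
            current_ec = line[5:].strip()
        elif line.startswith("DR   ") and current_ec:
            # Each cross-reference looks like 'P07327, ADH1A_HUMAN;'
            for entry in line[5:].split(";"):
                entry = entry.strip()
                if entry:
                    accession = entry.split(",")[0].strip()
                    acc_to_ec[accession].add(current_ec)
        elif line.startswith("//"):
            current_ec = None
    return dict(acc_to_ec)


def annotate(accessions, acc_to_ec):
    """Return {accession: sorted EC list}; an empty list means no EC was found."""
    return {acc: sorted(acc_to_ec.get(acc, ())) for acc in accessions}


if __name__ == "__main__":
    # 'enzyme.dat' is an assumed local path to the downloaded file.
    with open("enzyme.dat") as handle:
        mapping = parse_enzyme_dat(handle)
    print(annotate(["P02647"], mapping))
```

Accessions that come back with an empty list are simply not cross-referenced from any ENZYME entry, so it is worth keeping them separate rather than treating them as parsing failures.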
We did download that database already and are working on that - thanks for the suggestion! But what we are seeing is that many of the PIDs don't seem to have an equivalent EC number - most of them, actually. P02647 is a perfect example. So clearly we're missing something!
Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep threads logically organized.
Would you expect an EC number for every protein? I certainly wouldn't... isn't it just for enzymes?
Disclaimer: this is far from my usual comfort zone, so I might be talking nonsense. The only enzymes I care about are those working on DNA/RNA.
So P02647 is not an enzyme itself.