Hi, community! I have downloaded a list of genes from the MetaCyc database for some bacterial species, and I want to find their respective KO IDs. Can anyone please tell me how I can do that?
Thanks
There are a lot of ways, but the easiest I have found is to use UniProt to match the genes to KO (or GO) IDs. This assumes your organism(s) are in UniProt and categorised.
Programmatically, you can use the R package KEGGREST or Bio.KEGG in Python.
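As a rough illustration of the programmatic route, here is a minimal Python sketch that calls the KEGG REST `link` operation directly with the standard library instead of going through Bio.KEGG. It assumes your genes are already in KEGG's `organism:locus_tag` form (for example `eco:b0002`), which MetaCyc exports may not be. The function names `parse_link_response` and `fetch_ko` are made up for this example, and the sample text only illustrates the tab-separated format the endpoint returns.

```python
from urllib.request import urlopen

# KEGG REST "link" operation with "ko" as the target database.
KEGG_LINK_URL = "https://rest.kegg.jp/link/ko/"


def parse_link_response(text):
    """Turn tab-separated 'gene<TAB>ko:Kxxxxx' lines into {gene: [KO, ...]}."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        gene, ko = line.split("\t")
        # Drop the "ko:" prefix so only the K number is stored.
        mapping.setdefault(gene, []).append(ko.split(":", 1)[-1])
    return mapping


def fetch_ko(gene_ids):
    """Hypothetical helper: ask KEGG for the KO IDs linked to a few genes."""
    url = KEGG_LINK_URL + "+".join(gene_ids)
    with urlopen(url) as response:
        return parse_link_response(response.read().decode("utf-8"))


# Offline illustration of the parsing step (sample text, not a live query).
sample = "eco:b0002\tko:K12524\neco:b0003\tko:K00872\n"
print(parse_link_response(sample))
```

With network access, `fetch_ko(["eco:b0002", "eco:b0003"])` would send the request for real. A gene can map to more than one KO, which is why each value is a list.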
You can use UniProt's Retrieve/ID mapping tool (https://www.uniprot.org/uploadlists/) to get some KO columns. You can also use the PIR database: https://proteininformationresource.org/
Thanks Mark! I have a large list. Can I retrieve the KO IDs as a batch?
Yes, but the KEGG API limits how many you can send in a batch. I can't remember the limit, but I think it was in the hundreds.
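One way to stay under a per-request limit is to split the gene list into fixed-size chunks and build one request URL per chunk. The sketch below does only that. The default `batch_size=10` is a deliberately small placeholder, not KEGG's documented limit, so check the KEGG API documentation for the real value. `chunked` and `build_link_urls` are hypothetical helper names, and the gene IDs are assumed to be in KEGG's `organism:locus_tag` form.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_link_urls(gene_ids, batch_size=10):
    """Build one KEGG REST 'link' URL per batch of gene IDs.

    batch_size is an assumed placeholder; replace it with the
    limit stated in the KEGG API documentation.
    """
    base = "https://rest.kegg.jp/link/ko/"
    return [base + "+".join(batch) for batch in chunked(gene_ids, batch_size)]


# Example: 25 made-up locus tags split into batches of 10, 10 and 5.
genes = [f"eco:b{n:04d}" for n in range(1, 26)]
for url in build_link_urls(genes):
    print(url)
```

Each URL can then be fetched in turn (for example with `urllib.request.urlopen`), ideally with a short pause between requests so the server is not flooded.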