This question is similar to the one I asked here: How to download all Gene Ontology (GO) IDs with their associated vocabulary?
I am looking to create a file with two columns. The first column should be KEGG IDs. The second should be the associated vocabulary of each KEGG ID. I want to do this for every KEGG ID of the organism I'm working with, Chlamydomonas reinhardtii.
Someone has already asked a similar question on the Bioconductor forum: https://support.bioconductor.org/p/109871/
However, I do not believe the answers there are suitable for my question. They were tailored for human data, for which Bioconductor has a dedicated annotation package, org.Hs.eg.db, and the answers essentially relied on that package. Unfortunately, I do not see a similar annotation package for Chlamydomonas reinhardtii.
You will more than likely need to use the KEGG API for this. I am not sure what exactly you are looking for.
That looks like a very useful resource, but I'm really unsure of how to use it. Looking around at the website though, I see specific pages like this one:
https://www.kegg.jp/entry/K14855
I do have a list of 'K numbers' (I think that's what you call terms like 'K14855') and, looking at the layout of the website, what I would like to do is 1) input each of my K numbers and 2) for each one, get back the pathways it is associated with.
I've mostly written a Python script to extract the relevant content from each URL directly, but the script takes a while to run because each request has a couple of seconds of delay, and sometimes it just crashes with an HTTPError. So if you can explain how I can use this API to do what I listed above, that would be great.
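For reference, here is a minimal sketch of what such a lookup might look like through the KEGG REST interface instead of scraping the entry pages. It assumes the `link` operation at `https://rest.kegg.jp/link/pathway/ko:<K number>` returns tab-separated lines of the form `ko:K14855<TAB>path:map03008`; that response format and the function names `parse_link_response` and `fetch_pathways` are assumptions for illustration, so check the KEGG API documentation before relying on them.

```python
import time
import urllib.request

KEGG_REST = "https://rest.kegg.jp"  # assumed base URL of the KEGG REST API


def parse_link_response(text):
    """Turn 'ko:K14855<TAB>path:map03008' lines into {K number: [pathway IDs]}."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        k_number = parts[0].split(":", 1)[-1]  # drop the 'ko:' prefix if present
        pathway = parts[1].split(":", 1)[-1]   # drop the 'path:' prefix if present
        mapping.setdefault(k_number, []).append(pathway)
    return mapping


def fetch_pathways(k_numbers, delay=1.0):
    """Query one K number at a time, pausing between requests (hypothetical helper)."""
    result = {}
    for k in k_numbers:
        url = f"{KEGG_REST}/link/pathway/ko:{k}"
        with urllib.request.urlopen(url) as response:
            text = response.read().decode("utf-8")
        result.update(parse_link_response(text))
        time.sleep(delay)  # keep the request rate low
    return result
```

Calling `fetch_pathways(["K14855"])` would then give a dictionary keyed by K number. The pause between requests is deliberate, and this is only sensible for a short list of K numbers.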
Downloading large amounts of data from KEGG requires a subscription; doing so without one would be a violation of their acceptable use policy.
Oh I see. I guess I will have to find another way to translate each K number into a set of corresponding pathway names then.
One more option may be to use the UniProt proteomes database, which includes KEGG IDs.
You can find the proteome page here: https://www.uniprot.org/uniprotkb?query=proteome:UP000006906 (click on the "Customize columns" link and include KEGG ID). Once you have the columns, use the "Download" button to get a table.
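Once the table is downloaded, something like the following sketch could reduce it to two columns. It assumes a tab-separated export whose header row contains columns named `Entry` and `KEGG`, with multiple KEGG IDs in one cell separated by semicolons; those column names, the separator, and the function name `kegg_ids_from_uniprot_tsv` are assumptions, so adjust them to match the file you actually get.

```python
import csv


def kegg_ids_from_uniprot_tsv(handle, entry_col="Entry", kegg_col="KEGG"):
    """Return (UniProt entry, KEGG ID) pairs from a tab-separated export.

    Column names and the ';' separator are assumptions about the export format.
    """
    pairs = []
    reader = csv.DictReader(handle, delimiter="\t")
    for record in reader:
        cell = record.get(kegg_col) or ""
        for kegg_id in cell.split(";"):
            kegg_id = kegg_id.strip()
            if kegg_id:  # skip entries with no KEGG cross-reference
                pairs.append((record[entry_col], kegg_id))
    return pairs
```

For example, with the downloaded file saved under a name of your choosing, `with open(path, newline="") as fh: pairs = kegg_ids_from_uniprot_tsv(fh)` gives one pair per KEGG ID, which can then be written out with `csv.writer`.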