Hi,
I would like to retrieve all the interactions between ligands and target proteins from the KEGG BRITE database. Ideally, each entry would contain a protein name, a list of interacting ligands, the protein's FASTA sequence, and the ligand coordinates in SDF or MOL2 format, all in a dataframe. I saw that there is an API for the KEGG DB, and that Python packages such as bioservices and Biopython can also be used for data retrieval. However, the basic tutorials I found had no example of how to retrieve the entire database without searching by a specific keyword (ligand name, protein name, etc.).
Can you recommend an 'advanced' tutorial on how to navigate the KEGG DB and extract different types of information (since I will need the FASTA sequences, the ligands in SDF or MOL2 format, etc.)? Any technical help would also be great! :)
The KEGG FTP site might be a good starting point. https://www.kegg.jp/kegg/download/
Hey! Thank you for the suggestion. If I got it right, I have to subscribe before I can download data from KEGG. (I did, and I am now waiting for a reply.) In the meantime, I think I found what I need at this link, which lets me download an htext file with information about protein targets and the corresponding drugs. However, I don't fully understand the structure of the file. For example, the CHRM entry has 5 gene products (HSA:1128 1129 1131 1132 1133), which appear either all together in one entry (CHRM) or separately in individual entries (CHRM1-CHRM4). However, not all of the corresponding interacting drugs are listed when compared with the information on this linked page, and for CHRM5 none are listed at all. Is this because I need a subscription to get the complete data? Do you have any idea about this?
Many thanks in advance.
I'm not familiar with the KEGG interactions database or the format of the information. It did look like signing up is required to access the FTP site; maybe check your spam/junk folder for their follow-up email? Otherwise, there appears to be an API that would let you query the KEGG BRITE database programmatically: https://www.kegg.jp/kegg/rest/keggapi.html. That may be a good alternative for getting the data you are looking for.
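In case it helps, here is a rough Python sketch of how the REST route could look. It is only a starting point under several assumptions: the base URL `https://rest.kegg.jp` and the `get` operation with the `aaseq` and `mol` options are taken from the API page linked above; the function names (`kegg_get`, `parse_htext`, `target_drug_pairs`) are made up for this example; the BRITE identifier `br08310` in the note below is a guess at the hierarchy you downloaded; and the sample text is invented to mimic the layout you describe (a level letter at the start of each line, gene IDs in `[HSA:...]`, drugs listed underneath), not copied from KEGG. I have not checked it against the full file.

```python
import re
import urllib.request

KEGG_REST = "https://rest.kegg.jp"  # assumed base URL of the KEGG REST API

HSA_RE = re.compile(r"\[HSA:([^\]]+)\]")     # gene IDs, e.g. [HSA:1128 1129]
DRUG_RE = re.compile(r"^(D\d{5})\s+(.*)$")   # drug lines, e.g. "D99999  Name"


def kegg_get(entry, option=""):
    """Fetch one KEGG entry as text, e.g. kegg_get("hsa:1128", "aaseq")."""
    url = f"{KEGG_REST}/get/{entry}" + (f"/{option}" if option else "")
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_htext(text):
    """Return (level, label, ancestor_labels) for each hierarchy line.

    Hierarchy lines are assumed to start with a capital letter (A = top
    level, B = next level, ...). Header lines starting with '+', '#' or '!'
    are skipped.
    """
    rows, ancestors = [], {}
    for line in text.splitlines():
        match = re.match(r"^([A-Z])(.*)$", line)
        if not match:
            continue
        level = ord(match.group(1)) - ord("A")
        label = re.sub(r"<[^>]+>", "", match.group(2)).strip()  # drop HTML tags
        if not label:
            continue
        # Forget ancestors at the same or a deeper level than this line.
        ancestors = {k: v for k, v in ancestors.items() if k < level}
        rows.append((level, label, [ancestors[k] for k in sorted(ancestors)]))
        ancestors[level] = label
    return rows


def target_drug_pairs(rows):
    """Pair each drug line with the genes of its nearest [HSA:...] ancestor."""
    pairs = []
    for _level, label, path in rows:
        drug = DRUG_RE.match(label)
        if not drug:
            continue
        for ancestor in reversed(path):
            genes = HSA_RE.search(ancestor)
            if genes:
                target = ancestor.split("[")[0].strip()
                for gene in genes.group(1).split():
                    pairs.append({
                        "target": target,
                        "gene": f"hsa:{gene}",
                        "drug": drug.group(1),
                        "drug_name": drug.group(2),
                    })
                break
    return pairs


# Invented sample in the assumed htext layout (drug IDs and names are fake).
SAMPLE = """+E\tDrug
#<h2>hypothetical header</h2>
!
A<b>G protein-coupled receptors</b>
B  Rhodopsin family
C    Acetylcholine (muscarinic)
D      CHRM [HSA:1128 1129]
E        D99999  Example drug A
D      CHRM1 [HSA:1128]
E        D99998  Example drug B
!
#
"""

pairs = target_drug_pairs(parse_htext(SAMPLE))
for pair in pairs:
    print(pair)
```

With the real data, you would replace `SAMPLE` with something like `kegg_get("br:br08310")` (again, that identifier is an assumption), then pass the result to `pandas.DataFrame(pairs)` to get the dataframe you mentioned. The sequences and structures could then be fetched one entry at a time, for example `kegg_get("hsa:1128", "aaseq")` for a FASTA record and `kegg_get("D99999", "mol")` with a real drug ID for a MOL file. Note that a drug listed under a grouped entry such as CHRM gets paired with every gene in that group, which may or may not be what you want.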