How to get a mapping between KEGG module and KEGG orthologs?
3.7 years ago
O.rka ▴ 740

I'm looking for a very simple table, but it is surprisingly hard to find: a mapping from each KEGG module to its KEGG orthologs in the following format: [MODULE]\t[KO_1, KO_2, ..., KO_N]

I downloaded an oddly formatted flat file from KEGG, but some of the KEGG modules contain other KEGG modules in their hierarchy (yes, I know KEGG is hierarchical), such as https://www.genome.jp/kegg-bin/show_module?M00615

Does anyone know where I can find this? I just need a very simple table for set comprehension.

KEGG Annotation Ortholog Module • 2.5k views
3.7 years ago
Elucidata ▴ 270

You can use R to construct the table with the KEGGREST package from Bioconductor, which you need to install and load first. The code below connects to the KEGG database, retrieves the list of modules, looks up the orthologs for each module, and builds the table.

#Install the package used to query the KEGG database

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("KEGGREST")

#Note: Comment out the installation code above once the installation is successful

#Load package
library(KEGGREST)

#Get list of modules in KEGG
mod <- keggList("module")

#Loop through each module and get the corresponding orthologs
#Each element of obj is a character vector c(moduleID, orthologs),
#so rbind() below yields a plain character matrix

obj <- lapply(names(mod), function(x) {
  #Strip the "md:" prefix, if present, to get the bare module ID
  module <- sub("^md:", "", x)
  #Retrieve the full record for this module
  ko <- keggGet(x)
  #Save the ortholog IDs as a single string separated by ","
  orthologs <- paste(names(ko[[1]]$ORTHOLOGY), collapse = ",")
  c(module, orthologs)
})

#Convert list to dataframe
df<-do.call(rbind,obj)

#Name columns
colnames(df)<-c("Module","KO")

#Display first few entries in the table
head(df)

#Save table to csv file
#Replace the placeholder path below with your own output location
write.csv(df, "path/to/file/filename.csv", row.names = FALSE)
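
A shorter route may be to ask KEGG for the module-to-KO links in one request instead of fetching every module record. The sketch below assumes that `keggLink("ko", "module")` returns a named character vector with module IDs as names (`md:` prefix) and KO IDs as values (`ko:` prefix). The helper name `collapse_links` is hypothetical, and the three input pairs are a hand-typed toy example in that shape, not data retrieved from KEGG.

```r
# Collapse a named vector of module -> KO links into one row per module.
# `links` is assumed to look like c("md:M00001" = "ko:K00844", ...).
collapse_links <- function(links) {
  modules <- sub("^md:", "", names(links))
  kos <- sub("^ko:", "", unname(links))
  # Join the KO IDs of each module into one comma-separated string
  grouped <- tapply(kos, modules, function(k) paste(unique(k), collapse = ","))
  data.frame(Module = names(grouped),
             KO = as.vector(grouped),
             stringsAsFactors = FALSE)
}

# With network access and KEGGREST installed, the real input would be:
# links <- KEGGREST::keggLink("ko", "module")

# Toy input in the assumed shape (illustrative only)
links <- c("md:M00001" = "ko:K00844",
           "md:M00001" = "ko:K12407",
           "md:M00002" = "ko:K00134")
tab <- collapse_links(links)
print(tab)
```

The result can be written out with `write.table(tab, "filename.tsv", sep = "\t", quote = FALSE, row.names = FALSE)` to get the tab-separated layout asked for in the question. Because this only uses ID pairs, no description text is involved.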

This is amazing! Thank you so much. It works almost perfectly. However, I noticed a few odd things. Do you know why some of the KO descriptions end up in the output? For example, 11 M00011 K00164,K00658,K00382,K00174,K00175,K00177,K00176,K01902,K01903,K01899,K01900,K18118,K00234,K00235,K00236,K00237,K00239,K00240,K00241,K00242,K18859,K18860,K00244,K00245,K00246,K00247 fumarate reductase [EC:1.3.5.4] [RN:R02164],K01676,K01679,K01677+K01678,K00026,K00025,K00024,K00116. Also, is it possible to output which "version" of KEGG this is for when I store the file? That could be useful for accessing this in the future.
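
On the version question, one possible approach is to store the release line from `keggInfo("kegg")` next to the table. This is only a sketch: it assumes the info text contains a line with the word "Release", the helper name `extract_release` is hypothetical, and the sample text is a made-up placeholder rather than real KEGG output.

```r
# Pull the first line mentioning "Release" out of a KEGG info text.
# Returns NA if no such line is found.
extract_release <- function(info_text) {
  lines <- unlist(strsplit(info_text, "\n", fixed = TRUE))
  hit <- grep("Release", lines, value = TRUE)
  if (length(hit) == 0) return(NA_character_)
  # Drop everything before the word "Release" and trim whitespace
  trimws(sub("^.*?(Release.*)$", "\\1", hit[1], perl = TRUE))
}

# With network access and KEGGREST installed, the real input would be:
# info <- KEGGREST::keggInfo("kegg")

# Placeholder text in the assumed layout (not real KEGG output)
info <- "kegg    Example database title\nkegg    Release 00.0 (placeholder)\n        Example maintainer"
release <- extract_release(info)
print(release)
```

The extracted string could then be saved with `writeLines(release, "filename.version.txt")` or added as an extra column before calling `write.csv`.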


Also, this method returns only 443 modules. What happened to other modules such as "M00080"? I don't see them on the KEGG website, but I do see them in earlier publications.
