R: download all KEGG pathways including KO and Compounds
4.5 years ago
dago ★ 2.8k

I saw this question has been asked here and there before. However, I could not find a tool that does the job for me.

I want to download all pathways from KEGG including KO and compounds using R. I would imagine creating an R object like:

$Path_1
...KO
...Compounds
$Path_2
...KO
...Compounds
$Path_3
...KO
...Compounds

Any idea how to download the data?

Thank you

R KEGG System_biology • 5.1k views

all pathways from KEGG including KO and compounds using R.

That would violate their acceptable use policy (AUP) if you don't have a license.


I did not think about this. I guess I am getting used to having open-source tools and databases. Thanks.

4.5 years ago
5heikki 11k

You can use their API. However, it is not meant for downloading the entire database. For that there is the FTP site, which requires a license.
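For a limited number of lookups, the API route could look roughly like the sketch below. It assumes the Bioconductor package KEGGREST and its `keggLink()` function, neither of which is mentioned in this thread, and it assumes the returned vector carries pathway IDs as names and KO or compound IDs as values; check that orientation on real output. `build_pathway_list()` is a hypothetical helper, and the toy vectors only stand in for API responses. Any real use remains subject to KEGG's terms.

```r
# Hypothetical helper: combine pathway -> KO and pathway -> compound links
# into one nested list. Both inputs are named character vectors whose
# names are pathway IDs and whose values are KO or compound IDs.
build_pathway_list <- function(ko_links, cpd_links) {
  ko_by_path  <- split(unname(ko_links),  names(ko_links))
  cpd_by_path <- split(unname(cpd_links), names(cpd_links))
  paths <- sort(union(names(ko_by_path), names(cpd_by_path)))
  out <- lapply(paths, function(p) {
    ko  <- ko_by_path[[p]]
    cpd <- cpd_by_path[[p]]
    list(
      KO        = if (is.null(ko))  character(0) else ko,
      Compounds = if (is.null(cpd)) character(0) else cpd
    )
  })
  names(out) <- paths
  out
}

# Toy input standing in for API output (IDs are illustrative only).
ko_links  <- c("path:map00010" = "ko:K00001", "path:map00010" = "ko:K00002",
               "path:map00020" = "ko:K00024")
cpd_links <- c("path:map00010" = "cpd:C00022", "path:map00020" = "cpd:C00022")

pathways <- build_pathway_list(ko_links, cpd_links)
str(pathways)

# With network access and KEGGREST installed, the real links might be fetched as:
# ko_links  <- KEGGREST::keggLink("ko", "pathway")
# cpd_links <- KEGGREST::keggLink("compound", "pathway")
```

The result has one element per pathway, each holding a `KO` and a `Compounds` vector, which matches the structure asked for in the question.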


Ah right, that may be why I could not find any tool that does this!

4.5 years ago
ATpoint 86k

MSigDB contains the KEGG pathways: https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp. Download the GMT file and then load it into R, e.g. with:

kegg <- fgsea::gmtPathways("c2.cp.kegg.v7.1.symbols.gmt")

> head(kegg)
$KEGG_GLYCOLYSIS_GLUCONEOGENESIS
 [1] "ACSS2"   "GCK"     "PGK2"    "PGK1"    "PDHB"    "PDHA1"   "PDHA2"   "PGM2"   
 [9] "TPI1"    "ACSS1"   "FBP1"    "ADH1B"   "HK2"     "ADH1C"   "HK1"     "HK3"    
[17] "ADH4"    "PGAM2"   "ADH5"    "PGAM1"   "ADH1A"   "ALDOC"   "ALDH7A1" "LDHAL6B"
[25] "PKLR"    "LDHAL6A" "ENO1"    "PKM"     "PFKP"    "BPGM"    "PCK2"    "PCK1"   
[33] "ALDH1B1" "ALDH2"   "ALDH3A1" "AKR1A1"  "FBP2"    "PFKM"    "PFKL"    "LDHC"   
[41] "GAPDH"   "ENO3"    "ENO2"    "PGAM4"   "ADH7"    "ADH6"    "LDHB"    "ALDH1A3"
[49] "ALDH3B1" "ALDH3B2" "ALDH9A1" "ALDH3A2" "GALM"    "ALDOA"   "DLD"     "DLAT"   
[57] "ALDOB"   "G6PC2"   "LDHA"    "G6PC"    "PGM1"    "GPI"    

$KEGG_CITRATE_CYCLE_TCA_CYCLE
 [1] "IDH3B"    "DLST"     "PCK2"     "CS"       "PDHB"     "PCK1"     "PDHA1"   
 [8] "PDHA2"    "SUCLG2P2" "FH"       "SDHD"     "OGDH"     "SDHB"     "IDH3A"   
[15] "SDHC"     "IDH2"     "IDH1"     "ACO1"     "ACLY"     "MDH2"     "DLD"     
[22] "MDH1"     "DLAT"     "OGDHL"    "PC"       "SDHA"     "SUCLG1"   "SUCLA2"  
[29] "SUCLG2"   "IDH3G"    "ACO2"    

$KEGG_PENTOSE_PHOSPHATE_PATHWAY
 [1] "RPE"     "RPIA"    "PGM2"    "PGLS"    "PRPS2"   "FBP2"    "PFKM"    "PFKL"   
 [9] "TALDO1"  "TKT"     "FBP1"    "TKTL2"   "PGD"     "RBKS"    "ALDOA"   "ALDOC"  
[17] "ALDOB"   "H6PD"    "RPEL1"   "PRPS1L1" "PRPS1"   "DERA"    "G6PD"    "PGM1"   
[25] "TKTL1"   "PFKP"    "GPI"
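If fgsea is not installed, a GMT file can also be read with base R alone. The sketch below assumes the usual GMT layout (tab-separated: set name, description, then members). `read_gmt()` is a hypothetical helper written for illustration, not a function from fgsea or MSigDB, and the two-line demo file is made up.

```r
# Hypothetical helper: read a GMT file into a named list of character vectors.
read_gmt <- function(path) {
  lines  <- readLines(path, warn = FALSE)
  lines  <- lines[nzchar(lines)]                 # skip empty lines
  fields <- strsplit(lines, "\t", fixed = TRUE)
  # Drop the first two fields (set name and description); keep the members.
  sets   <- lapply(fields, function(f) f[-(1:2)])
  names(sets) <- vapply(fields, `[`, character(1), 1)
  sets
}

# Made-up demo file so the example runs without downloading anything.
demo <- tempfile(fileext = ".gmt")
writeLines(c("SET_A\tdescription\tGENE1\tGENE2",
             "SET_B\tdescription\tGENE3"), demo)

kegg_demo <- read_gmt(demo)
str(kegg_demo)
```

Pointing `read_gmt()` at the downloaded MSigDB file should give a list shaped like the `kegg` object above, with gene symbols only.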

That is actually great! But I am not sure there are compounds here, just gene names. No?


Not the best solution, because it is for single organisms, but genome-scale metabolic models (http://bigg.ucsd.edu/data_access) have all the information you need regarding the gene-protein-reaction associations. Once you have the gene ID, getting the KO with eggNOG should not be a problem.
