This is simple to do with the Cactvs Cheminformatics Toolkit (version 3.428 or later, www.xemistry.com/academic):
Tcl single-line script:
molfile write RP00079.rxn [reaction create KEGG:RP00079]
Python single-line script:
Molfile.Write('RP00079.rxn',Reaction('KEGG:RP00079'))
This writes a standard MDL RXN file with full atom mapping. Structures and reaction/mapping data are extracted from the dynamically downloaded KEGG KCF data, not from the Molfiles. With a few more script commands you can of course do much more with the reaction data than dump it into a file.
In my experience it is highly advisable to use the original data formats of a site whenever possible: KCF for KEGG, ASN.1 for PubChem, etc. Use tools that let you work with the original data. Any remote format conversion risks data loss and transformation artifacts that you cannot control, such as the atom scrambling in KEGG Molfile output.
$RXN
C00036_C00149
WI pycactvs 101920141408RP00079
1 1
$MOL
Oxaloacetate
WIpycactvs10191414082D 0 0.00000 0.00000C00036
13 12 0 0 0 0 0 0 0 0999 V2000
21.5736 -15.6638 0.0000 C 0 0 0 0 0 0 0 0 0 1 0 0
22.9771 -15.6638 0.0000 C 0 0 0 0 0 0 0 0 0 2 0 0
20.6530 -16.7130 0.0000 C 0 0 0 0 0 0 0 0 0 3 0 0
21.0529 -14.4776 0.0000 O 0 0 0 0 0 0 0 0 0 4 0 0
23.6787 -16.8742 0.0000 C 0 0 0 0 0 0 0 0 0 5 0 0
19.2817 -16.4364 0.0000 O 0 0 0 0 0 0 0 0 0 6 0 0
21.0402 -18.0393 0.0000 O 0 0 0 0 0 0 0 0 0 7 0 0
22.9706 -18.0843 0.0000 O 0 0 0 0 0 0 0 0 0 8 0 0
25.0057 -16.7976 0.0000 O 0 0 0 0 0 0 0 0 0 9 0 0
23.7783 -15.3712 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
22.8285 -14.8239 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
19.0097 -15.6280 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
22.1177 -18.0793 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
1 3 1 0 0 0 0
1 4 2 0 0 0 0
2 5 1 0 0 0 0
3 6 1 0 0 0 0
3 7 2 0 0 0 0
5 8 1 0 0 0 0
5 9 2 0 0 0 0
2 10 1 0 0 0 0
2 11 1 0 0 0 0
6 12 1 0 0 0 0
8 13 1 0 0 0 0
A 001
C5a
A 002
C1b
A 003
C6a
A 004
O5a
A 005
C6a
A 006
O6a
A 007
O6a
A 008
O6a
A 009
O6a
M END
$MOL
(S)-Malate
WIpycactvs10191414082D 0 0.00000 0.00000C00149
15 14 0 0 1 0 0 0 0 0999 V2000
25.0942 -18.6205 0.0000 C 0 0 1 0 0 0 0 0 0 1 0 0
26.2953 -19.3132 0.0000 C 0 0 0 0 0 0 0 0 0 2 0 0
23.8998 -19.3132 0.0000 C 0 0 0 0 0 0 0 0 0 3 0 0
25.0942 -17.2355 0.0000 O 0 0 0 0 0 0 0 0 0 4 0 0
27.4897 -18.6205 0.0000 C 0 0 0 0 0 0 0 0 0 5 0 0
22.7052 -18.6143 0.0000 O 0 0 0 0 0 0 0 0 0 7 0 0
23.8360 -20.8303 0.0000 O 0 0 0 0 0 0 0 0 0 6 0 0
28.6842 -19.3132 0.0000 O 0 0 0 0 0 0 0 0 0 8 0 0
27.4834 -17.2355 0.0000 O 0 0 0 0 0 0 0 0 0 9 0 0
24.3420 -18.1872 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
25.7379 -19.9787 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
26.8543 -19.9774 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
25.8460 -16.8015 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
22.7102 -17.7462 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
28.6825 -20.1813 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
1 3 1 0 0 0 0
1 4 1 1 0 0 0
2 5 1 0 0 0 0
3 6 1 0 0 0 0
3 7 2 0 0 0 0
5 8 1 0 0 0 0
5 9 2 0 0 0 0
1 10 1 0 0 0 0
2 11 1 0 0 0 0
2 12 1 0 0 0 0
4 13 1 0 0 0 0
6 14 1 0 0 0 0
8 15 1 0 0 0 0
A 001
C1c
A 002
C1b
A 003
C6a
A 004
O1a
A 005
C6a
A 006
O6a
A 007
O6a
A 008
O6a
A 009
O6a
M END
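To check the mapping in a file like the one above, the atom-atom mapping numbers can be read back from the V2000 atom block: in each atom line, the tenth integer after the element symbol is the mapping number (0 means unmapped). The following is only a minimal sketch in plain Python, not part of Cactvs. The function names atom_map_numbers and pair_atoms are made up for illustration, and the code assumes the fields are separated by whitespace; a strict parser would read the fixed-width columns instead.

```python
def atom_map_numbers(mol_block):
    """Return {atom_index: (symbol, map_number)} for the mapped atoms
    of one V2000 mol block (atom_index is 1-based, as in the bond block).

    Assumption: fields are whitespace-separated. This holds for the
    file shown above but is not guaranteed by the fixed-width format.
    """
    lines = mol_block.strip().splitlines()
    # The counts line is the one that ends with the version tag.
    counts_at = None
    for i, line in enumerate(lines):
        if line.rstrip().endswith("V2000"):
            counts_at = i
            break
    if counts_at is None:
        raise ValueError("no V2000 counts line found")

    n_atoms = int(lines[counts_at].split()[0])
    first = counts_at + 1
    mapped = {}
    for atom_index, line in enumerate(lines[first:first + n_atoms], start=1):
        fields = line.split()
        symbol = fields[3]            # x, y, z, then the element symbol
        map_number = int(fields[13])  # tenth integer after the symbol
        if map_number:
            mapped[atom_index] = (symbol, map_number)
    return mapped


def pair_atoms(reactant_block, product_block):
    """Return {reactant_atom_index: product_atom_index} for atoms that
    carry the same mapping number on both sides."""
    product_by_number = {
        number: index
        for index, (_, number) in atom_map_numbers(product_block).items()
    }
    return {
        index: product_by_number[number]
        for index, (_, number) in atom_map_numbers(reactant_block).items()
        if number in product_by_number
    }
```

Applied to the two $MOL blocks above, pair_atoms gives the reactant-to-product atom correspondence directly from the mapping numbers, independent of the order in which the atoms happen to be listed in each block.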
Hi, I am running into the same problem.
It looks like every RPAIR is referenced to the left-hand-side molecule, with total disregard for the Molfile atom numbering.
Have you figured out a solution?
Thanks
Same problem here. It seems I have to map the chemical structures used in RPAIR entries to the COMPOUND structures manually. I hope it won't be that hard, at least for carbon atoms.