All unique compound structures (SMILES) in the PubChem database
9.3 years ago
ajingnk ▴ 130

Hi everyone,

I want to get all unique compound structures in the PubChem database. I have downloaded the SDF files for PubChem, but they are 45 GB even after gzip compression, so converting every SDF file to SMILES would not be easy. Is there any way to retrieve the SMILES for the whole of PubChem directly?

Thanks,
Jing

compound pubchem • 5.2k views
9.3 years ago
ajingnk ▴ 130

In case someone else needs it: PubChem provides InChI data for all compounds.

The InChI data can be downloaded over FTP from: ftp://ftp.ncbi.nlm.nih.gov/pubchem/Compound/Extras/CID-InChI-Key.gz
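For anyone who wants to work with the download without unpacking it first, here is a minimal sketch that streams the gzipped file one line at a time. It assumes each line holds three tab-separated fields (CID, InChI string, InChIKey) and no header row; that layout is an assumption and should be checked against the first few lines of the actual file. The names `InChIRecord` and `iter_inchi_records` are made up for this example.

```python
import gzip
from typing import Iterator, NamedTuple


class InChIRecord(NamedTuple):
    """One row of the file (assumed layout: CID, InChI, InChIKey)."""
    cid: int
    inchi: str
    inchi_key: str


def iter_inchi_records(path: str) -> Iterator[InChIRecord]:
    """Yield records from a gzipped, tab-separated file without
    loading the whole file into memory."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Skip anything that does not match the assumed 3-column layout.
            if len(fields) != 3 or not fields[0].isdigit():
                continue
            cid, inchi, inchi_key = fields
            yield InChIRecord(int(cid), inchi, inchi_key)
```

Something like `for rec in iter_inchi_records("CID-InChI-Key.gz"): ...` then processes the file record by record, so memory use stays flat regardless of file size.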

3.8 years ago
gbrault • 0

109,485,000 SMILES in this file as of 2/17/2021.
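A count like this can be checked by streaming the gzipped file and counting non-empty lines, assuming the file stores one record per line. This is only a sketch; `count_records` is a made-up helper name, and the path passed to it is whichever file you downloaded.

```python
import gzip


def count_records(path: str) -> int:
    """Count non-empty lines in a gzipped text file, reading it as a stream."""
    count = 0
    with gzip.open(path, "rb") as handle:
        for line in handle:
            # Ignore blank lines so a trailing newline does not inflate the count.
            if line.strip():
                count += 1
    return count
```

Reading in binary mode avoids decoding every line, which helps a little on a file with over a hundred million rows.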
