This seems to be a difficult question to get a clean answer to. We need the answer as background for our binding domain mapping project: http://synpharm.guidetopharmacology.org/
This PDBe query indicates 9458 compounds for 35340 human proteins, but there are many issues with the PDB ID mappings to 5860 human Swiss-Prot entries, including non-human proteins and false positives (i.e. no bound compound).
In addition, the compound download sheet is unparsable, has many gaps, and includes many hetero atoms, including inorganics, without binding pockets.
Has anyone tried to do this? Some kind of pocket filter might help to separate the ~8-10K small-molecule true binders from the 33853 hetero entries.
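To make the pocket filter idea concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration: the `Atom` record, the `likely_bound_ligands` function, the exclusion list, and the thresholds (at least 6 heavy atoms, at least 5 protein residues within 4 Å) are hypothetical and not taken from PDBe or any existing tool. It expects atoms that have already been parsed from a structure file by some other means.

```python
from dataclasses import dataclass
from math import dist

# Hypothetical exclusion list: common ions, solvents, and buffer components.
# An illustrative subset only, not a curated list.
EXCLUDED_HET = {"HOH", "NA", "CL", "MG", "ZN", "CA", "K", "SO4", "PO4", "GOL", "EDO"}


@dataclass
class Atom:
    res_name: str    # residue name or HET code, e.g. "ALA" or "LIG"
    res_id: str      # chain plus residue number, e.g. "A:101"
    element: str     # element symbol, e.g. "C"
    xyz: tuple       # (x, y, z) coordinates in angstroms
    is_hetero: bool  # True for HETATM records


def likely_bound_ligands(atoms, min_heavy_atoms=6, cutoff=4.0, min_contact_residues=5):
    """Return the (res_name, res_id) hetero groups that pass a crude pocket filter.

    A group passes if it is not on the exclusion list, has enough heavy atoms,
    and has enough distinct protein residues within `cutoff` of any of its atoms.
    All thresholds are assumed defaults and would need tuning.
    """
    protein = [a for a in atoms if not a.is_hetero and a.element != "H"]

    # Group heavy hetero atoms by residue, skipping excluded HET codes.
    groups = {}
    for a in atoms:
        if a.is_hetero and a.element != "H" and a.res_name not in EXCLUDED_HET:
            groups.setdefault((a.res_name, a.res_id), []).append(a)

    kept = set()
    for key, lig_atoms in groups.items():
        if len(lig_atoms) < min_heavy_atoms:
            continue  # too small to be a drug-like small molecule
        # Distinct protein residues with any heavy atom near the ligand.
        contacts = {
            p.res_id
            for p in protein
            if any(dist(p.xyz, l.xyz) <= cutoff for l in lig_atoms)
        }
        if len(contacts) >= min_contact_residues:
            kept.add(key)
    return kept
```

A residue-contact count is only a crude proxy for a real binding pocket, and this sketch ignores covalent ligands, peptides, and multi-residue ligands, so it would at best be a first pass for thinning out the hetero entries.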