Our database team (http://www.guidetopharmacology.org/) would like to set up some kind of alerting triage for human proteins, covering the following criteria: a) novel ligands in an old protein, b) old ligands in different proteins, c) a protein that gets any ligand for the first time. There are some additional filters, such as chemical properties and pocket locations indicative of specifically bound, lead-like, small-ish molecular structures (i.e. pharmacologically relevant ligands rather than just new hetero atom entities). We'd also want a frequent-hitters stop list. For example, we're not that interested in seeing yet more ligands from popular kinases, BACE1, DPPIV, FX or thrombin (except for clinical compounds of course, but we can select these anyway).
None of the three wwPDB portals seems to allow such a query off the bat. In addition, we would prefer a transatlantic split in the form of Swiss-Prot IDs for the proteins but PubChem CIDs for the ligands (i.e. rather than gi numbers or UniChem IDs, though we could cope with these). For the record, we do check MMDB for novel CIDs and skim the PDBe weekly new entry lists, but a specific alert would obviously be more efficient. Might a SPARQL query or a KNIME node be an option?
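To make the three criteria concrete, here is a minimal Python sketch of the triage step only. It assumes we already have protein-ligand pairs as (Swiss-Prot accession, PubChem CID) tuples from some source, which is exactly the part we don't have yet. The function name `classify`, the tag letters, and the accessions and CIDs used with it are placeholders for illustration, not real identifiers, and the chemical-property and pocket filters are not covered.

```python
def classify(protein, ligand, known_pairs, stop_list=frozenset()):
    """Return the alert tags ("a", "b", "c") raised by one new protein-ligand pair.

    protein     -- Swiss-Prot accession (placeholder strings in this sketch)
    ligand      -- PubChem CID (placeholder integers in this sketch)
    known_pairs -- set of (protein, ligand) tuples already seen
    stop_list   -- proteins we never want alerts for (frequent hitters)
    """
    # Frequent hitters and already-recorded pairs raise no alert.
    if protein in stop_list or (protein, ligand) in known_pairs:
        return []

    known_proteins = {p for p, _ in known_pairs}
    known_ligands = {l for _, l in known_pairs}

    tags = []
    if protein in known_proteins and ligand not in known_ligands:
        tags.append("a")  # novel ligand in an old protein
    if ligand in known_ligands:
        tags.append("b")  # old ligand turning up in a different protein
    if protein not in known_proteins:
        tags.append("c")  # protein gets a ligand for the first time
    return tags
```

A pair can raise more than one tag: an old ligand appearing in a protein with no previous ligands would be flagged as both b) and c). In this sketch the stop list is checked first, so the clinical-compound exception would have to be applied before calling the function.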
This is still a gap for us that we can't really fill with a KNIME node or a local PDB copy just now. Does anyone know of any other ways forward (BioMart?)