I am developing an educational bioinformatics framework, and I need to know:
- Which atoms are absolutely not found in any PDB files?
in the following list -
So far, from my limited searching, I haven't found a list that already compiles this information. However, the data from which you could mine the elements found in PDB files is available and updated weekly.
Proteopedia's 'Ligand' page is a good jumping-off point for what I found. It has a lot of links about the non-standard residues and heteroatoms found in PDB file entries. Importantly, it links to the Chemical Component Dictionary, which is updated weekly.
"This dictionary contains detailed chemical descriptions for standard and modified amino acids/nucleotides, small molecule ligands, and solvent molecules"
The full dictionary is available there, and each entry has a _chem_comp.formula line that could be parsed to collect all the represented elements. From there you could use Python's set math to find the elements that are not represented. Of course, you could do the same in your favorite computational language.
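As a rough illustration of that idea, here is a minimal Python sketch. It makes several assumptions that are mine and not from the dictionary documentation: that each formula sits on a single line of the form `_chem_comp.formula "C3 H7 N O2"`, that each token is an element symbol followed by an optional count, and that anything else (such as `?` for a missing formula or a trailing charge) can be ignored. The function name `elements_in_formula_lines` and the `candidates` set are hypothetical; replace `candidates` with your own list of atoms.

```python
import re

# One formula token: a 1-2 letter element symbol plus an optional count, e.g. "C3", "N", "ZN".
# Assumption: tokens that do not fit this shape (e.g. "?" or a charge) are skipped.
TOKEN = re.compile(r"^([A-Za-z]{1,2})(\d*)$")


def elements_in_formula_lines(lines):
    """Collect the element symbols seen on _chem_comp.formula lines."""
    found = set()
    for line in lines:
        parts = line.strip().split(None, 1)
        # Exact tag match, so _chem_comp.formula_weight is not picked up.
        if len(parts) != 2 or parts[0] != "_chem_comp.formula":
            continue
        value = parts[1].strip().strip("\"'")
        for token in value.split():
            match = TOKEN.match(token)
            if match:
                # Normalise case so "ZN" and "Zn" count as the same element.
                found.add(match.group(1).capitalize())
    return found


# Tiny in-memory sample standing in for the downloaded dictionary file.
sample = [
    "_chem_comp.id ALA",
    '_chem_comp.formula "C3 H7 N O2"',
    "_chem_comp.formula_weight 89.093",
    "_chem_comp.formula ZN",
    "_chem_comp.formula ?",
]
found = elements_in_formula_lines(sample)

# Hypothetical candidate list; set difference gives the symbols never seen.
candidates = {"H", "C", "Zn", "Xe", "Pu"}
absent = candidates - found
print(sorted(absent))
```

To run it against the real data you would pass an open file handle for the downloaded dictionary instead of `sample`, since the function only needs an iterable of lines. Checking a few entries by hand first is worthwhile, because the parsing assumptions above may not hold for every record.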
The dictionary and its history are further described on the Chemical Component Dictionary page of the Worldwide Protein Data Bank. There's even an associated publication by Westbrook et al., 2014.