Hi,
How can I get all the Ligands
from PDB
with the X, Y, Z coordinates
of the atoms and the chain and residue IDs?
Thanks
Option 1: As of today, there are 10788 ligands in the PDB. Use the list of links via Advanced search. Using this form, you can download the data in Structure Data File (SDF) format. Click on Display / Download to download the data.
Option 2: If you need to extract the XYZ coordinates directly from individual PDB files, use the list from the above link and get the list of PDB IDs from the display ID. Then use this list to download the full coordinate files using the Advanced search interface.
Choose the query type "PDB IDs", paste the IDs, and click on Display / Download to download the data. Once you have downloaded the PDB files, you can use a small Perl / shell / grep script to extract the ligand information.
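If you would rather script the download step than use the web form, something along these lines may work. It is only a sketch: the URL pattern (files.rcsb.org/download/<ID>.pdb) and the helper names pdb_url and download_pdb_files are my own assumptions, not something taken from the Advanced search interface described above.

```python
import urllib.request
from pathlib import Path

# Assumption: RCSB serves PDB-format coordinate files at this address.
BASE_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id):
    """Build the (assumed) download URL for one PDB ID."""
    return BASE_URL.format(pdb_id=pdb_id.strip().upper())


def download_pdb_files(pdb_ids, out_dir="pdb_files"):
    """Download each PDB ID into out_dir and return the local paths.

    Files that already exist locally are not downloaded again.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for pdb_id in pdb_ids:
        target = out / (pdb_id.strip().upper() + ".pdb")
        if not target.exists():
            urllib.request.urlretrieve(pdb_url(pdb_id), str(target))
        paths.append(target)
    return paths
```

Calling download_pdb_files(["1ASH"]) should leave a file pdb_files/1ASH.pdb on disk, provided the server is reachable.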
Here is an example of how to extract the ligand information from a downloaded PDB file (PDB ID: 1ASH).
Ligand information is stored in lines that start with the HETATM record name, so you can easily grep for it as follows:
grep "^HETATM" 1ASH.pdb > 1ASH_ligand.pdb
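The grep gives you the raw HETATM lines; to pull out the X, Y, Z coordinates together with the chain and residue IDs, you can slice the fixed-width columns of each record. Below is a minimal Python sketch of that idea. The function name parse_hetatm and the skip_water option are hypothetical, and the column positions are those of the standard PDB fixed-column layout. Note that water molecules (residue name HOH) are also stored as HETATM records, so the sketch skips them by default.

```python
def parse_hetatm(lines, skip_water=True):
    """Return one dict per HETATM record found in an iterable of PDB lines.

    Column positions follow the fixed-width PDB format
    (0-based Python slices of the 1-based columns in the spec).
    """
    records = []
    for line in lines:
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()          # columns 18-20
        if skip_water and res_name == "HOH":
            continue                            # waters are HETATM too
        records.append({
            "atom": line[12:16].strip(),        # columns 13-16
            "res_name": res_name,
            "chain": line[21:22].strip(),       # column 22
            "res_id": int(line[22:26]),         # columns 23-26
            "x": float(line[30:38]),            # columns 31-38
            "y": float(line[38:46]),            # columns 39-46
            "z": float(line[46:54]),            # columns 47-54
        })
    return records


# Example use on a downloaded file:
#
# with open("1ASH.pdb") as handle:
#     for rec in parse_hetatm(handle):
#         print(rec["x"], rec["y"], rec["z"], rec["chain"], rec["res_id"])
```

This ignores insertion codes and alternate locations; if your files use them, you would need to read columns 27 and 17 as well.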
Depending on what you need, there is also a standard reference describing all the small molecules and residues in the PDB including idealized coordinates. It is called the "chemical component dictionary": http://www.wwpdb.org/ccd.html
And here is an answer which retrieves exactly what was asked for, with a single tool, and without additional pre- or post-processing.
Problems like these are a typical application for a cheminformatics scripting tool, like our Cactvs toolkit. There are free academic downloads at http://www.xemistry.com/academic.
Using the 1ASH.pdb sample file cited above, a minimal script in the Tcl interface language looks like
set eh [molfile read http://www.pdb.org/pdb/files/1ASH.pdb]
filter create hetatm property A_RESIDUE(hetatm) value 1 operator =
foreach coords [ens get $eh A_XYZ hetatm] \
chain [ens get $eh A_RESIDUE(chain) hetatm] \
resid [ens get $eh A_RESIDUE(resid) hetatm] {
puts [format "%.2f\t%.2f\t%.2f\t%s\t%s" {*}$coords $chain $resid]
}
and in Python (sponsored by Vertex Inc.)
eh=Molfile.Read('http://www.pdb.org/pdb/files/1ASH.pdb')
f=Filter('hetatm',{'property':'A_RESIDUE(hetatm)','value':True,'operator':'='})
for (coords,residue) in zip(eh.get('A_XYZ',filters=f),eh.get('A_RESIDUE',filters=f)):
    print("{:.2f}\t{:.2f}\t{:.2f}\t{}\t{}".format(coords[0],coords[1],coords[2],residue.chain, residue.resid))
A script is available that automates this for you, based on the Binding MOAD database (a database in which structures and ligands have been pre-filtered according to strict standards: resolution, validation of the ligand, etc.). You can find it here: https://github.com/lucagl/MOAD_ligandFinder