2.6 years ago
arinjoy
Hello, how do I retrieve the active/catalytic site information from PDB or UniProt using Python? I need the residues that are present in the enzyme's active site as well as their residue numbers in the peptide sequence. Is there a way to extract this information?
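For the UniProt side of the question, a minimal sketch follows. It assumes the UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/<accession>.json` and a JSON layout in which each entry in `features` has a `type` (here `"Active site"`) and a `location.start.value` position. The function names `fetch_uniprot_entry` and `active_site_residues` are hypothetical, and the field names should be checked against a real entry before relying on them.

```python
import json
from urllib.request import urlopen


def fetch_uniprot_entry(accession):
    """Download one UniProtKB entry as a dict (assumed endpoint, needs network)."""
    url = f"https://rest.uniprot.org/uniprotkb/{accession}.json"
    with urlopen(url) as response:
        return json.load(response)


def active_site_residues(entry):
    """Return [(position, one_letter_residue, description), ...] for active sites.

    Positions are 1-based indices into the UniProt sequence, which may differ
    from the residue numbering used in a PDB file of the same protein.
    """
    sequence = entry["sequence"]["value"]
    sites = []
    for feature in entry.get("features", []):
        # Assumption: catalytic residues are annotated with this feature type.
        if feature.get("type") != "Active site":
            continue
        position = feature["location"]["start"]["value"]
        residue = sequence[position - 1]  # convert 1-based position to 0-based index
        sites.append((position, residue, feature.get("description", "")))
    return sites
```

Calling `active_site_residues(fetch_uniprot_entry(accession))` with an accession of interest would give the annotated positions. Mapping them onto PDB residue numbers is a separate step that this sketch does not cover.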
What have you tried?
Are you only looking for those annotated as active/catalytic in the PDB header section by the depositors / in the course of remediation by the PDB maintainers? For example the header of 1pop lists CYS25 and HIS159 as 'CAT' sites.
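If the header annotations are enough, they can be read without any third-party library. The sketch below assumes the fixed-column SITE record layout of the legacy PDB format (site identifier in columns 12-14, up to four residues per line) and uses a hypothetical function name, `parse_site_records`. It only finds sites that the depositors or the PDB maintainers actually annotated.

```python
def parse_site_records(pdb_text):
    """Return {site_id: [(res_name, chain_id, res_seq), ...]} from SITE records."""
    sites = {}
    for line in pdb_text.splitlines():
        if not line.startswith("SITE  "):
            continue
        line = line.ljust(80)  # pad so fixed-column slicing is always safe
        site_id = line[11:14].strip()
        residues = sites.setdefault(site_id, [])
        # Each SITE line holds up to four residues at fixed column offsets.
        for offset in (18, 29, 40, 51):
            res_name = line[offset:offset + 3].strip()
            if not res_name:
                continue  # unused slot on this line
            chain_id = line[offset + 4].strip()
            res_seq = int(line[offset + 5:offset + 9])
            residues.append((res_name, chain_id, res_seq))
    return sites
```

For a file such as 1pop, `parse_site_records(open("1pop.pdb").read()).get("CAT")` would be expected to list the cysteine and histidine mentioned above, assuming the file is in the legacy PDB format rather than mmCIF.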
Only a specific class of enzyme? Often there are specialized databases, curated by experts in the field, with more information than PDB or UniProt alone have.
Looked at here?
Do you want to predict them for a class of enzyme based on relative distance and arrangement constraints of consensus residues? I've seen Python taught to structural biologists using PDB files to mine those structures that match the Cys-His-Asp catalytic triad. (I updated the end of my reply here with a step-by-step example using a related script for catalytic-triad-detection to scan PDB files in a directory.)
Tried PDBSiteScan?
Tried CSmetaPred?
Tried MEDscore?
Tried 'Functional residue identification' via DeepFRI?
If it is possible to get the active site information through the headers, that would be best. But as I saw, not all PDB files have the 'CAT' sites mentioned. Wouldn't that make it difficult to write generalized code to parse the info? From the residue numbers of the active sites, I would then need to obtain their coordinates for further analysis.
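Once the residue numbers are known from any source, getting their coordinates is a matter of reading the matching ATOM records. A minimal sketch, assuming the legacy fixed-column PDB format and a hypothetical helper name `residue_coordinates`:

```python
def residue_coordinates(pdb_text, residues):
    """Return {(chain_id, res_seq): {atom_name: (x, y, z)}} for the given residues.

    `residues` is an iterable of (chain_id, res_seq) pairs, e.g. [("A", 25)].
    If an atom has alternate locations, the last one read wins.
    """
    wanted = set(residues)
    coords = {}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        key = (line[21], int(line[22:26]))  # chain ID, residue sequence number
        if key not in wanted:
            continue
        atom_name = line[12:16].strip()
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        coords.setdefault(key, {})[atom_name] = xyz
    return coords
```

Insertion codes and multiple models are ignored here, so a structure that uses them would need extra handling.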
You need to check the PDB header. You can also extract the ligands from the PDB file, if it has any, and look at the residues within 3-5 Å of the ligand using PyMOL scripts in Python.
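The same distance idea can be sketched in plain Python instead of PyMOL, which may be easier to run in batch. This assumes the legacy fixed-column PDB format, treats HETATM records with a given residue name as the ligand, and uses a hypothetical function name, `residues_near_ligand`. The 4.0 Å default is just one value from the 3-5 Å range suggested above.

```python
import math


def residues_near_ligand(pdb_text, ligand_name, cutoff=4.0):
    """Return sorted (chain_id, res_seq, res_name) with any atom within `cutoff` Å of the ligand."""
    protein_atoms, ligand_atoms = [], []
    for line in pdb_text.splitlines():
        record = line[:6]
        if record not in ("ATOM  ", "HETATM"):
            continue
        res_name = line[17:20].strip()
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if record == "HETATM" and res_name == ligand_name:
            ligand_atoms.append(xyz)
        elif record == "ATOM  ":
            protein_atoms.append(((line[21], int(line[22:26]), res_name), xyz))
    near = set()
    for residue, xyz in protein_atoms:
        # Brute-force all-pairs check; fine for a single structure.
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.add(residue)
    return sorted(near)
```

Residues found this way are in contact with the ligand, which is not necessarily the same thing as being catalytic.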
There are a lot of things classified as 'ligands' in structures that aren't necessarily located at the active or catalytic site, @Pappu. @arinjoy, what @Pappu proposes is certainly an option. However, there are lots of options for exploring residues in contact with 'ligands' aside from scripting in PyMOL, or from scripting anything at all to get that information. There are several sites derived from data deposited at the PDB that already offer various forms of this information. This is why I encouraged the OP to flesh out the request much more.
Yes, the question is not clear, so I gave an opinion.