I have two protein structures in PDB format. The two structures refer to two conformations of the same protein. I want to calculate the RMSD between them.
Is there Python module that will allow me to do this?
I have two protein structures in PDB format. The two structures refer to two conformations of the same protein. I want to calculate the RMSD between them.
Is there Python module that will allow me to do this?
You can use PyMOL to calculate alignment. To do this follow this steps:
--
*where pdbId1 and pdbId2 is identifier of your protein in PDB database.
Hi j.cossio.diaz
I struggled with the same problem and i find two module which can solve structural protein alignment in python.
Please take a look to the link: https://github.com/charnley/rmsd
You can explore the algorithm by using examples, which are exist in the zip file. The only think you have to do is extract the "ATOMS" with x,y,z coordinates from PDB files.
There is also a tool call biopython, which have a Bio.PDB module to align PDB files. here is the link: http://combichem.blogspot.com.tr/2013/08/aligning-pdb-structures-with-biopython.html
A UCSF Chimera
option:
(Taken from a more complicated script I wrote for use with pychimera
here: script, pychimera)
NB, pychimera takes a little more involved installation compared to pymol, but I really like it.
The code below should work inside chimera's python interpreter though I believe with no extra configuration.
from MatchMaker import (match,
CP_BEST,
GAP_OPEN,
GAP_EXTEND,
defaults,
MATRIX,
ITER_CUTOFF)
chimera.openModels.open(model1,type="PDB")
chimera.openModels.open(model2,type="PDB")
all_models = chimera.openModels.list(modelTypes=[chimera.Molecule])
mod0 = all_models[0]
mod1 = all_models[1]
for atoms1, atoms2, rmsd, fullRmsd in match(CP_BEST,[mod0, mod1],defaults[MATRIX],
"nw",defaults[GAP_OPEN],defaults[GAP_EXTEND]):
mol0_name = atoms2[0].molecule
mol1_name = atoms1[0].molecule
printmol0_name.name + "\t" + mol2_name.name + "\t" + str(rmsd))
In Biopython, the Bio.SVDSuperimposer class will quickly align the two PDB structures and compute the RMSD. See this Bio.PDB FAQ, and search the page for "RMSD", or read the PDB section of the Biopython tutorial.
Hi, I know this might be very late to comment to it, but Bio.SVDSuperimposer does not seem to rotate the proteins to get the lowest RMSD: i get RMSD = 0.2 for chains that when aligned in PyMOL give RMSD = 0.06
here is my code:
fixed = Bio.PDB.Selection.unfold_entities(Bio.PDB.Polypeptide.PPBuilder().build_peptides(Bio.PDB.PDBParser(QUIET = True).get_structure('X' , '1A00_B.pdb') , aa_only = True)[0], 'A')
moving = Bio.PDB.Selection.unfold_entities(Bio.PDB.Polypeptide.PPBuilder().build_peptides(Bio.PDB.PDBParser(QUIET = True).get_structure('X' , '1A00_D.pdb') , aa_only = True)[0], 'A')
sup = Bio.PDB.Superimposer()
sup.set_atoms(fixed , moving)
print(sup.rms)
sup.apply(moving)
print(sup.rms)
OUTPUT:
0.199679715722
0.199679715722
the "sup.apply(moving)" line does not seem to do anything.
Is this the correct way to find the RMSD between two structures in Biopython? is there a better script to write? or a better module to use than Biopython?
The RMS in Bio.PDB.Superimposer
is calculated using transformed coordinates, so it shows the RMS of the final superimposed result, not the RMS between input arrays.
By the way, your code is not a correct way to test change of RMS, because .apply
function doesn't change anything inside Superimposer
class, it just applies computed transformation matrix to the argument. The coordinates you provide to .set_atom
method are copied to numpy
arrays inside Superimposer
class, so any change to them after that won't affect Superimposer
members.
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Hello j.cossio.diaz!
It appears that your post has been cross-posted to another site: http://stackoverflow.com/questions/34534589
This is typically not recommended as it runs the risk of annoying people in both communities.