Question

Implementation of python function

1

3.8 years ago

anasjamshed ▴ 140

I need to write a Python function called superimpose that does the following:

a) Use Bio.PDB and NumPy for the implementation

b) The function takes 7 arguments:

1) A PDB identifier (a string of length 4)
2) A chain identifier (a string of length 1)
3) A list of residue numbers
4) A second PDB identifier (a string of length 4)
5) A second chain identifier (a string of length 1)
6) A second list of residue numbers (a list of n integers)
7) The name of an output PDB file (a string)

I have tried this script :

import numpy as np
import Bio
from Bio.PDB import *

def superimpose(pdb_ident, chain_id, list_res, pdb_ident_2, chain_id_2, list_res_2, outfile):

But now I am confused about how to proceed. Can anyone help me, please?

DNA python PDB • 2.4k views

ADD COMMENT • link updated 3.7 years ago by Biostar 20 • written 3.8 years ago by anasjamshed ▴ 140

3

You have been on this site long enough to know that it is not meant for helping with homework. And you don't have a script - you only converted the requirements of your homework into 3 imports and one line that begins to define a function.

Google is your friend here. A trivial search with biopython function superimpose as keywords will yield plenty of information to get you going.

ADD REPLY • link 3.8 years ago by Mensur Dlakic ★ 28k

0

I am stuck on the 7 arguments. How can I declare local variables inside a function?

ADD REPLY • link 3.8 years ago by anasjamshed ▴ 140

0

Can anyone please help me? It's urgent.

ADD REPLY • link 3.8 years ago by anasjamshed ▴ 140

1

@Mensur Dlakic pointed you in the right direction: Google is your friend here. Hint: python variables function. Even if you don't know any python, taking this step can help you.

ADD REPLY • link 3.8 years ago by seidel 11k
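
On the question of declaring local variables: in Python, a function's parameters are already local variables, and any name assigned inside the body is local as well, so nothing has to be declared beforehand. A minimal sketch follows; the function name `describe_selection` and its return format are made up for illustration and are not part of the assignment.

```python
def describe_selection(pdb_id, chain_id, residue_numbers):
    """Summarise one structure selection; all three parameters are local names."""
    # Names assigned inside the body are local to the function as well.
    first = min(residue_numbers)
    last = max(residue_numbers)
    return f"{pdb_id}:{chain_id} residues {first}-{last} ({len(residue_numbers)} selected)"


# Arguments are bound to the parameters by position when the function is called.
print(describe_selection("1ubq", "A", [1, 2, 3, 4]))
```

A seven-parameter function works the same way: each argument passed in the call becomes available under its parameter name inside the body.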

0

I have tried this:

import Bio.PDB
import numpy

pdb_code = "2xmf"
pdb_filename = "%s.pdb" % pdb_code
pdb_out_filename = "%s_aligned.pdb" % pdb_code

seq_str = 'MAAGVKQLADDRTLLMAGVSHDLRTPLTRIRLATEMMSEQDGYLAESINKDIEECNAIIEQFIDYLR'
use_str = '-----------RTLLMAGVSHDLRTPLTRIRLATEMMSEQDGYLAESINKDI---------------'
use = [letter!="-" for letter in use_str]
assert len(use) == len(seq_str)

print("Loading PDB file %s" % pdb_filename)
structure = Bio.PDB.PDBParser().get_structure(pdb_code, pdb_filename)

print("Everything aligned to first model...")
ref_model = structure
for alt_model in structure :
    #Build paired lists of c-alpha atoms, ref_atoms and alt_atoms
    #[code shown later]

    #Align these paired atom lists:
    super_imposer = Bio.PDB.Superimposer()
    super_imposer.set_atoms(ref_atoms, alt_atoms)

    if ref_model.id == alt_model.id :
        #Check for self/self get zero RMS, zero translation
        #and identity matrix for the rotation.
        assert numpy.abs(super_imposer.rms) < 0.0000001
        assert numpy.max(numpy.abs(super_imposer.rotran[1])) < 0.000001
        assert numpy.max(numpy.abs(super_imposer.rotran[0] - numpy.identity(3))) < 0.000001
    else :
        #Update the structure by moving all the atoms in
        #this model (not just the ones used for the alignment)
        super_imposer.apply(alt_model.get_atoms())

    print("RMS(first model, model %i) = %0.2f" % (alt_model.id, super_imposer.rms))

but the output is giving me an error:

ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-9-95fdfda2d723> in <module>()
     10     #Align these paired atom lists:
     11     super_imposer = Bio.PDB.Superimposer()
---> 12     super_imposer.set_atoms(ref_atoms, alt_atoms)
     13 
     14     if ref_model.id == alt_model.id :

~\Anaconda3\lib\site-packages\Bio\PDB\Superimposer.py in set_atoms(self, fixed, moving)
     43         sup = SVDSuperimposer()
     44         sup.set(fixed_coord, moving_coord)
---> 45         sup.run()
     46         self.rms = sup.get_rms()
     47         self.rotran = sup.get_rotran()

~\Anaconda3\lib\site-packages\Bio\SVDSuperimposer\__init__.py in run(self)
    150         reference_coords = self.reference_coords
    151         # center on centroid
--> 152         av1 = sum(coords) / self.n
    153         av2 = sum(reference_coords) / self.n
    154         coords = coords - av1

ZeroDivisionError: division by zero

ADD REPLY • link updated 3.8 years ago by Ram 44k • written 3.8 years ago by anasjamshed ▴ 140

1

Did you try Googling the error message along with the function names from the stack trace? Learning to read and decode error messages is a huge part of building applications, large or small.

ADD REPLY • link 3.8 years ago by Ram 44k
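
Reading the traceback from the bottom up: the failing line is `av1 = sum(coords) / self.n`, so `self.n`, the number of atoms handed to the superimposer, was zero. That suggests `ref_atoms` and `alt_atoms` were empty when `set_atoms` was called. The plain-Python sketch below mimics that centroid step to show the same failure mode; the `centroid` helper is illustrative only and is not Biopython code.

```python
def centroid(coords):
    """Average a list of (x, y, z) tuples, like `sum(coords) / n` in the traceback."""
    n = len(coords)
    # Sum each axis separately; an empty list sums to zero on every axis.
    sums = [sum(axis) for axis in zip(*coords)] if coords else [0.0, 0.0, 0.0]
    return tuple(total / n for total in sums)  # raises ZeroDivisionError when n == 0


print(centroid([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]))  # two atoms: works

try:
    centroid([])  # no atoms were collected
except ZeroDivisionError as exc:
    print("empty atom list ->", type(exc).__name__)
```

Printing `len(ref_atoms)` and `len(alt_atoms)` just before the `set_atoms` call would confirm or rule out this reading.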

0

I have tried searching Stack Exchange but was unable to solve it.

ADD REPLY • link 3.8 years ago by anasjamshed ▴ 140

0

Did you zero in on what the value of alt_model is when the script fails? How is it different from its previous value? Maybe use the stack trace to look at particular characteristics of the object? Please tell us what you have tried.

ADD REPLY • link 3.8 years ago by Ram 44k

0

The reference model is a myosin. I have tried this script to store the atoms:

ref_atoms = []
alt_atoms = []
for (ref_chain, alt_chain) in zip(ref_model, alt_model) :
    for ref_res, alt_res, amino, allow in zip(ref_chain, alt_chain, seq_str, use) :
            assert ref_res.resname== alt_res.resname
            assert ref_res.id      == alt_res.id
            assert amino == Bio.PDB.Polypeptide.three_to_one(ref_res.resname)
            if allow :
                #CA = alpha carbon
                ref_atoms.append(ref_res['CA'])                
                alt_atoms.append(alt_res['CA'])

but it gives the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-11-830aa186fc87> in <module>()
      3 for (ref_chain, alt_chain) in zip(ref_model, alt_model) :
      4     for ref_res, alt_res, amino, allow in zip(ref_chain, alt_chain, seq_str, use) :
----> 5             assert ref_res.resname== alt_res.resname
      6             assert ref_res.id      == alt_res.id
      7             assert amino == Bio.PDB.Polypeptide.three_to_one(ref_res.resname)

AttributeError: 'Chain' object has no attribute 'rename'

ADD REPLY • link 3.8 years ago by anasjamshed ▴ 140

0

That looks to me as if you could have a typo in your code (rename instead of resname). However, I cannot see this in the code you posted...

ADD REPLY • link 3.8 years ago by cschu181 ★ 2.8k
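
Another possible reading, assuming `ref_model` was still bound to the whole structure (`ref_model = structure`) as in the earlier snippet: Bio.PDB objects nest as Structure > Model > Chain > Residue > Atom, and a `for` loop over one level yields the level below. Zipping over a Structure would then yield Models, and the inner loop would yield Chains, which have no `resname`. Each Bio.PDB entity carries a one-letter `level` attribute ("S", "M", "C", "R", "A"), so a small helper can report what a loop will produce. The helper name `what_iteration_yields` is hypothetical, and the stand-in objects below only imitate that attribute.

```python
from types import SimpleNamespace

LEVELS = ["S", "M", "C", "R", "A"]
NAMES = {"S": "Structure", "M": "Model", "C": "Chain", "R": "Residue", "A": "Atom"}


def what_iteration_yields(entity):
    """Return the name of the objects produced by `for child in entity`."""
    position = LEVELS.index(entity.level)
    if position == len(LEVELS) - 1:
        raise ValueError("atoms have no children to iterate over")
    return NAMES[LEVELS[position + 1]]


# Stand-ins carrying only the `level` attribute, so the sketch runs without a PDB file.
fake_structure = SimpleNamespace(level="S")
fake_model = SimpleNamespace(level="M")

print(what_iteration_yields(fake_structure))  # Model
print(what_iteration_yields(fake_model))      # Chain
```

With a real parsed structure, a result of "Model" for `ref_model` rather than "Chain" would indicate that `ref_model` should be `structure[0]`.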

0

I have tried this now :

import numpy as np
import Bio
from Bio.PDB import * 

def superimpose(pdb_code_1, chain_id_1, list_res_1,  pdb_code_2, chain_id2, list_res2):
    # Select what residues numbers you wish to align
    # and put them in a list
    start_id = 1
    end_id   = 70
    atoms_to_be_aligned = range(start_id, end_id + 1)
    pdb_code_1= "1d3z"
    pdb_filename_1 = "%s.pdb" % pdb_code_1
    pdb_out_filename = "%s_aligned.pdb" % pdb_code_1
    pdb_code_2= "1ubq"
    pdb_filename_2 = "%s.pdb" % pdb_code_2
    # Start the parser
    pdb_parser = Bio.PDB.PDBParser(QUIET = True)
    # Get the structures
    ref_structure = pdb_parser.get_structure("reference", pdb_filename_1)
    sample_structure = pdb_parser.get_structure("sample", pdb_filename_2)
    # Use the first model in the pdb-files for alignment
    # Change the number 0 if you want to align to another structure
    ref_model    = ref_structure[0]
    sample_model = sample_structure[0]
    # Make a list of the atoms (in the structures) you wish to align.
    # In this case we use CA atoms whose index is in the specified range
    list_res_1 = []
    list_res_2 = []
    # Iterate over all chains in the model in order to find all residues
    for ref_chain in ref_model:
        # Iterate over all residues in each chain in order to find proper atoms
        for ref_res in ref_chain:
            # Check if residue number ( .get_id() ) is in the list
            if ref_res.get_id()[1] in atoms_to_be_aligned:
                # Append CA atom to list
                ref_atoms.append(ref_res['O'])
                # Do the same for the sample structure
                for sample_chain in sample_model:
                    for sample_res in sample_chain:
                        if sample_res.get_id()[1] in atoms_to_be_aligned:
                            sample_atoms.append(sample_res['O'])
                            # Now we initiate the superimposer:
                            super_imposer = Bio.PDB.Superimposer()
                            super_imposer.set_atoms(ref_atoms, sample_atoms)
                            super_imposer.apply(sample_model.get_atoms())
                            # Print RMSD:
                            print (list_res_1,list_res2)

print(superimpose("1d3z","A",[1,2,3,4],"1ubq","A",[1,2,3,4]))

but it gives me the following error :

PDBException: Fixed and moving atom lists differ in size

ADD REPLY • link 3.8 years ago by anasjamshed ▴ 140

0

Google the error message and try to understand what is happening. Or see if you have a programmer friend who can help you out. We cannot debug your code in this piecemeal fashion, especially when it looks like an assignment/exercise question. Use this opportunity to learn debugging.

ADD REPLY • link 3.8 years ago by Ram 44k

0

I want help, which is why I posted it here. You can see that I am trying hard, so someone should help me.

ADD REPLY • link 3.8 years ago by anasjamshed ▴ 140
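
For later readers: the PDBException in the last attempt is consistent with `set_atoms` being called inside the nested loops, while the two atom lists are still being filled and have unequal lengths. One way to restructure it is sketched below, under several assumptions: each structure is saved locally as `<pdb_code>.pdb` as in the post, only the first model is used, every listed residue has a CA atom, and the two residue lists pair up one-to-one. The helper name `ca_atoms` is made up; `PDBParser`, `Superimposer`, and `PDBIO` are Bio.PDB classes.

```python
def ca_atoms(chain, residue_numbers):
    """Collect the CA atom of each listed residue, in the order given."""
    return [chain[number]["CA"] for number in residue_numbers]


def superimpose(pdb_id_1, chain_id_1, residues_1,
                pdb_id_2, chain_id_2, residues_2, outfile):
    """Fit structure 2 onto structure 1 and write the moved structure to outfile."""
    if not residues_1 or len(residues_1) != len(residues_2):
        raise ValueError("residue lists must be non-empty and of equal length")

    # Imported here so the checks above run even where Biopython is missing.
    from Bio.PDB import PDBIO, PDBParser, Superimposer

    parser = PDBParser(QUIET=True)
    # Assumption: files named like 1ubq.pdb sit in the working directory.
    fixed = parser.get_structure(pdb_id_1, f"{pdb_id_1}.pdb")
    moving = parser.get_structure(pdb_id_2, f"{pdb_id_2}.pdb")

    # Build both atom lists completely before superimposing.
    fixed_atoms = ca_atoms(fixed[0][chain_id_1], residues_1)
    moving_atoms = ca_atoms(moving[0][chain_id_2], residues_2)

    superimposer = Superimposer()
    superimposer.set_atoms(fixed_atoms, moving_atoms)
    # Move every atom of the second structure, not just the fitted ones.
    superimposer.apply(list(moving.get_atoms()))

    writer = PDBIO()
    writer.set_structure(moving)
    writer.save(outfile)
    return superimposer.rms
```

A call such as `superimpose("1d3z", "A", [1, 2, 3, 4], "1ubq", "A", [1, 2, 3, 4], "1ubq_on_1d3z.pdb")` would then return the RMSD of the fit and write the transformed second structure, provided both input files exist.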