Trouble with Biopython residue depth
5 days ago
Anand KR • 0

I want to calculate residue depth for a PDB entry (5xtd). I tried both the PDBParser and the MMCIFParser, together with the ResidueDepth module (https://biopython.org/docs/1.75/api/Bio.PDB.ResidueDepth.html).

Code :

parser = PDB.MMCIFParser(QUIET=True)
structure = parser.get_structure('Complex_1', file)

selected_chains = ['s','I' ,'A']


residue_depth_calculator = ResidueDepth(structure[0], msms_exec='/usr/local/bin/msms.x86_64Darwin.2.6.1')

# Calculate residue depth
residue_data = []
for model in structure:
    for chain in model:
        if chain.id in selected_chains:  # Check if the chain is in the selected list
            for residue in chain:
                if PDB.is_aa(residue):  # Only consider amino acid residues
                    depth = residue_depth_calculator[residue]
                    residue_data.append({
                        'Residue ID': residue.get_id(),
                        'Residue Name': residue.get_resname(),
                        'Depth': depth
                    })

# Create a DataFrame from the residue data
depths_df = pd.DataFrame(residue_data)

With the MMCIFParser, the residue depth step fails. The output is shown below, followed by the error message.

Please let me know if you have any suggestions.

  • Output:

     srdf: un sommet est faux
     srdf: un sommet est faux
     srdf: un sommet est faux
     srdf: un sommet est faux
     srdf: un sommet est faux
     srdf: un sommet est faux
     sphere_mange_arete: inconcistence
     sphere_mange_arete: inconcistence
     srdf: un sommet est faux
     srdf: un sommet est faux
     srdf: un sommet est faux
     sphere_mange_arete: inconcistence
     sphere_mange_arete: inconcistence
     sphere_mange_arete: inconcistence

  • Error message:

Failed to generate surface file using command: msms -probe_radius 1.5 -if /var/folders/n9/88_jl5pd3hv58lntgstxvvj40000gn/T/tmp3u1qdxxg -of /var/folders/n9/88_jl5pd3hv58lntgstxvvj40000gn/T/tmpt02cl5pq > /var/folders/n9/88_jl5pd3hv58lntgstxvvj40000gn/T/tmpl3jxiuox

I am working on a Mac and have tried both with and without passing a path to msms.


Two suggestions:

  • After I download and unpack msms on a Linux system, I can run it with ./msms.x86_64Linux2.2.6.1. The startup banner (below) is dated 1994, and from reading about the mmCIF format here, you'll see it wasn't really established until 1997, so you'd probably want to try the PDB format instead. Maybe more telling, the Biopython documentation on ResidueDepth focuses on the PDB parser and not the mmCIF one, and the README included with msms specifies PDB input, which you convert to xyzr or xyzrn format using the shell scripts pdb_to_xyzr or pdb_to_xyzrn, respectively.

     MSMS 2.6.1 started on jupyter-fomightez-2dcl-5fdemo-2dbinder-2d4c8bvrih
     Copyright M.F. Sanner (1994)
     Compilation flags -O2 -DVERBOSE -DTIMING
     MSMS: No input stream specified
    
  • Always provide complete code that reproduces what you see. The current version of the shared code doesn't. You want to help those helping you. (Please read about a minimal reproducible example here, under 'Help others reproduce the problem'.)

    To get something like what you report, I needed the following, which covers only the part before your line # Calculate residue depth, so the rest was moot:

      from Bio.PDB.MMCIFParser import MMCIFParser
      parser = MMCIFParser(QUIET=True)
      file = '5xtd.cif'
      structure = parser.get_structure('Complex_1', file)
    
      selected_chains = ['s','I' ,'A']
    
      from Bio.PDB.ResidueDepth import get_surface
      model = structure[0]
      surface = get_surface(model,MSMS='./msms.x86_64Linux2.2.6.1')
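
The error message in the question shows Biopython shelling out to a bare msms command, so one thing worth checking first is whether an msms executable can be found at all. Below is a minimal sketch using only the standard library. The helper name resolve_msms and the candidate list are hypothetical and not part of Biopython.

```python
import shutil


def resolve_msms(candidates=("msms",)):
    """Return the first candidate that resolves to an executable, or None.

    Each candidate may be a bare command name (looked up on PATH) or a
    path to a binary, such as the unpacked msms download.
    """
    for candidate in candidates:
        found = shutil.which(candidate)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    # Hypothetical candidates: adjust to wherever msms was unpacked.
    path = resolve_msms(["msms", "./msms.x86_64Linux2.2.6.1"])
    print(path if path else "no msms executable found")
```

If this returns a path, that path can be passed to get_surface through its MSMS argument, as in the snippet above. If it returns None, the "Failed to generate surface file" error is to be expected whichever parser is used.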
    

To put things on an even footing with complete code:

I simplified to a small single-chain protein, and that works using the PDB parser.
Here is the code:

from Bio.PDB.PDBParser import PDBParser
from Bio.PDB.Polypeptide import is_aa  # based on https://biopython.org/docs/1.75/api/Bio.PDB.Polypeptide.html
from Bio.PDB.ResidueDepth import get_surface, residue_depth

parser = PDBParser()
structure = parser.get_structure("1crn", "1crn.pdb")
model = structure[0]
# This step gives something like the following if msms has not been downloaded,
# installed, and set to work as `msms`:
# `RuntimeError: Failed to generate surface file using command: ; msms -probe_radius 1.5 -if /tmp/tmpilsljfx9 -of /tmp/tmpyh3l4u4w > /tmp/tmpawwp1gs7`
surface = get_surface(model)
#residue_depth_calculator = ResidueDepth(model)

selected_chains = ['s','I' ,'A']

# Calculate residue depth
residue_data = []
for model in structure:
    for chain in model:
        if chain.id in selected_chains:  # Check if the chain is in the selected list
            for residue in chain:
                if is_aa(residue):  # Only consider amino acid residues
                    depth = residue_depth(model[chain.id][residue.get_id()], surface)
                    residue_data.append({
                        'Residue ID': residue.get_id()[1],
                        'Residue Name': residue.get_resname(),
                        'Depth': depth
                    })

# Create a DataFrame from the residue data
import pandas as pd
depths_df = pd.DataFrame(residue_data)
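
For comparison with the loop in the question: per the Biopython documentation, a ResidueDepth object is indexed by a (chain id, residue id) tuple, not by a Residue object, and each entry is a (residue depth, C-alpha depth) pair. Below is a hedged sketch of that lookup. collect_depths is a hypothetical helper, and depth_map can be a ResidueDepth instance or any mapping with the same keys.

```python
def collect_depths(model, depth_map, selected_chains, is_aa=None):
    """Build one row per amino-acid residue in the selected chains.

    depth_map is expected to map (chain_id, residue_id) to a
    (residue_depth, ca_depth) pair, as Bio.PDB.ResidueDepth does.
    """
    if is_aa is None:
        # Imported lazily so the helper can be exercised without Biopython.
        from Bio.PDB.Polypeptide import is_aa

    rows = []
    for chain in model:
        if chain.id not in selected_chains:
            continue
        for residue in chain:
            key = (chain.id, residue.get_id())
            # Skip non-amino-acid residues and anything without a depth entry.
            if not is_aa(residue) or key not in depth_map:
                continue
            residue_depth, ca_depth = depth_map[key]
            rows.append({
                "Chain": chain.id,
                "Residue ID": residue.get_id()[1],
                "Residue Name": residue.get_resname(),
                "Depth": residue_depth,
                "CA Depth": ca_depth,
            })
    return rows
```

With Biopython and msms installed, this would be called as rows = collect_depths(structure[0], ResidueDepth(structure[0]), ['A']), and the rows can then go into pd.DataFrame in the same way as residue_data above.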

You can see it running in a notebook with all the necessary extra steps by clicking here. The boring static version in full context can be viewed here.

I do see an issue, though, when applying the same code to 5xtd. Biopython doesn't seem to like the discontinuous chains and gives warnings; that is at the bottom of the notebook I linked to above. Then I see it fail in the ways that you report. Maybe someone with more familiarity with msms can comment?

You may need alternative ways to get at this. I do note that the command print {*}.surfacedistance.max works with 5xtd in Jmol; see about surfacedistance in Jmol here.
