How to edit .pdb files quickly?
8.6 years ago
ReWeeda ▴ 120

Good morning,

I'm a bioinformatics student at the University of Bologna.

I'm working on a personal project and using MUSTANG to perform multiple structural alignment. The algorithm requires .pdb files that contain just one chain, so I'm looking for tools or suggestions for editing all my files quickly and reliably.

So far I have written a few lines of Python, but I don't think they are specific enough to edit every file correctly; in fact, I often have to check all the structures again by hand.

Could someone help me?

Thanks in advance! Dade.

pdb editing • 6.5k views
8.6 years ago
gearoid ▴ 200
"""
Extract a single chain from a PDB file
"""

import argparse
import sys

import Bio.PDB


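class ChainSelect(Bio.PDB.Select):
    """Selection filter that keeps only the chain with the given ID."""

    def __init__(self, target_chain):
        self.target_chain = target_chain

    def accept_chain(self, chain):
        # PDBIO writes a chain only when this returns a true value.
        return chain.get_id() == self.target_chain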

def main():
    argparser = argparse.ArgumentParser(description="Extract chain from a PDB file")
    argparser.add_argument('infile', help="Path to input file (PDB)")
    argparser.add_argument('chain', help="Chain to extract")
    argparser.add_argument('outfile', help="Path to output file (PDB)")
    args = argparser.parse_args()

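    pdbparser = Bio.PDB.PDBParser()
    io = Bio.PDB.PDBIO()
    # Parse the whole structure; the input path doubles as the structure ID.
    with open(args.infile, 'r') as infile:
        struct = pdbparser.get_structure(args.infile, infile)
        io.set_structure(struct)
    # Write it back out, letting ChainSelect drop every other chain.
    with open(args.outfile, 'w') as outfile:
        io.save(outfile, ChainSelect(args.chain))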
    return 0

if __name__=="__main__":
    sys.exit(main())

Run it with something like:

python extract_chain.py 1XXX.pdb A 1XXXA.pdb

to extract just chain A from 1XXX.pdb.

Using UNIX commands

grep '^ATOM' 1XXX.pdb | awk '$5=="A"' > 1XXXA.pdb

As far as I know, the PDB file format defines fields by fixed column positions rather than by whitespace, so the fields that precede the chain ID in an ATOM record aren't guaranteed to be separated by whitespace. This command will therefore seem to work, but it will silently fail to extract some atoms from the chain you want.

You can see in the example here that some of the atoms have the chain in $4 and some in $5: http://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#ATOM
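
To illustrate the column-based approach, here is a minimal sketch in plain Python that reads the chain ID from column 22 (string index 21) instead of splitting on whitespace. The names `filter_chain`, `CHAIN_RECORDS` and `main` are my own, not part of any library, and the sketch assumes a single-model file: it drops MODEL/ENDMDL and all header records, so treat it as a starting point rather than a full PDB writer.

```python
import sys

# Record types that carry the chain ID in column 22 of the PDB format.
CHAIN_RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")


def filter_chain(lines, chain_id):
    """Return the coordinate records that belong to chain_id.

    The chain is taken from the fixed column 22 (index 21), so records
    whose neighbouring fields run together are still handled correctly.
    """
    kept = []
    for line in lines:
        record = line[:6].strip()
        if record in CHAIN_RECORDS and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    return kept


def main(infile, chain_id, outfile):
    # Read every line, keep only the requested chain, and close with END.
    with open(infile) as handle:
        kept = filter_chain(handle, chain_id)
    with open(outfile, "w") as handle:
        handle.writelines(kept)
        handle.write("END\n")


if __name__ == "__main__":
    # Usage (same argument order as the script in the answer above):
    #   python filter_chain.py 1XXX.pdb A 1XXXA.pdb
    main(*sys.argv[1:4])
```

A record such as a residue numbered 1000 in chain A shows the difference: the chain and residue number appear as `A1000` with no space between them, so `awk '$5=="A"'` skips it, while the column check keeps it.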


You're right.

Nevertheless, I've adopted another solution that merges several ideas I found on the web. Now I just have to write a shell script that runs my .py over all the structures in one go.
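
The batch step could also be driven from Python rather than a shell script. The sketch below is only an illustration under several assumptions: it calls the `extract_chain.py` script from the answer above (not the poster's own .py, which isn't shown), it wants the same chain ID from every file, and the directory names `pdb` and `single_chain` are made up. `build_jobs` and `run_jobs` are hypothetical helper names.

```python
import subprocess
import sys
from pathlib import Path


def build_jobs(in_dir, out_dir, chain_id):
    """List (infile, chain, outfile) triples for every .pdb file in in_dir."""
    jobs = []
    for infile in sorted(Path(in_dir).glob("*.pdb")):
        # 1XXX.pdb with chain A becomes 1XXXA.pdb, as in the answer above.
        outfile = Path(out_dir) / f"{infile.stem}{chain_id}.pdb"
        jobs.append((infile, chain_id, outfile))
    return jobs


def run_jobs(jobs, script="extract_chain.py"):
    """Run the extraction script once per job; stop on the first failure."""
    for infile, chain_id, outfile in jobs:
        outfile.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(
            [sys.executable, script, str(infile), chain_id, str(outfile)],
            check=True,
        )


if __name__ == "__main__":
    # Assumed layout: inputs in ./pdb, outputs in ./single_chain, chain A.
    run_jobs(build_jobs("pdb", "single_chain", "A"))
```

Writing the outputs to a separate directory keeps a second run from picking up the already-extracted files as new inputs.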

8.6 years ago
ReWeeda ▴ 120

It works perfectly! Thanks!
