Need Help To Find The Error In An Edited Pdb File
11.5 years ago
prt-hntr ▴ 10

Hi all, I hope someone here is an expert in the PDB format. I have a small motif in PDB format, and inside the file I want to change the amino acid residue names to user-defined names. I did that using awk and found that the original file opens in Chimera, Jmol, etc., but the edited file does not. If I replace the residue names manually, the file works like the original motif. Please help me find the error in the awk-edited file compared with the original PDB file. For example:

Original file

ATOM   1243  N   TYR B 105      65.157  40.402 -13.263  1.00 13.49           N  
ATOM   1244  CA  TYR B 105      64.938  40.910 -11.922  1.00 13.31           C  
ATOM   1245  C   TYR B 105      64.747  39.748 -10.950  1.00 14.10           C  
ATOM   1246  O   TYR B 105      65.292  39.756  -9.849  1.00 14.27           O  
ATOM   1247  CB  TYR B 105      63.710  41.804 -11.884  1.00 10.62           C  
ATOM   1248  CG  TYR B 105      63.439  42.357 -10.511  1.00 11.78           C  
ATOM   1249  CD1 TYR B 105      63.921  43.597 -10.128  1.00 11.65           C  
ATOM   1250  CD2 TYR B 105      62.682  41.640  -9.598  1.00 14.01           C  
ATOM   1251  CE1 TYR B 105      63.648  44.113  -8.869  1.00 13.06           C  
ATOM   1252  CE2 TYR B 105      62.401  42.149  -8.338  1.00 15.13           C  
ATOM   1253  CZ  TYR B 105      62.882  43.386  -7.981  1.00 13.16           C  
ATOM   1254  OH  TYR B 105      62.548  43.905  -6.748  1.00 14.20           O  
ATOM   1255  N   GLY B 106      63.959  38.758 -11.352  1.00 13.40           N  
ATOM   1256  CA  GLY B 106      63.744  37.615 -10.491  1.00 14.83           C  
ATOM   1257  C   GLY B 106      65.044  36.864 -10.231  1.00 15.24           C  
ATOM   1258  O   GLY B 106      65.287  36.428  -9.107  1.00 13.96           O

Edited Using AWK

ATOM   1243  N     A01 B 105      65.157  40.402 -13.263  1.00 13.49           N       
ATOM   1244  CA     A01 B 105      64.938  40.910 -11.922  1.00 13.31           C       
ATOM   1245  C     A01 B 105      64.747  39.748 -10.950  1.00 14.10           C       
ATOM   1246  O     A01 B 105      65.292  39.756  -9.849  1.00 14.27           O       
ATOM   1247  CB     A01 B 105      63.710  41.804 -11.884  1.00 10.62           C       
ATOM   1248  CG     A01 B 105      63.439  42.357 -10.511  1.00 11.78           C       
ATOM   1249  CD1 A01 B 105      63.921  43.597 -10.128  1.00 11.65           C       
ATOM   1250  CD2 A01 B 105      62.682  41.640  -9.598  1.00 14.01           C       
ATOM   1251  CE1 A01 B 105      63.648  44.113  -8.869  1.00 13.06           C       
ATOM   1252  CE2 A01 B 105      62.401  42.149  -8.338  1.00 15.13           C       
ATOM   1253  CZ     A01 B 105      62.882  43.386  -7.981  1.00 13.16           C       
ATOM   1254  OH     A01 B 105      62.548  43.905  -6.748  1.00 14.20           O      
ATOM   1255  N     A02 B 106      63.959  38.758 -11.352  1.00 13.40           N       
ATOM   1256  CA     A02 B 106      63.744  37.615 -10.491  1.00 14.83           C       
ATOM   1257  C     A02 B 106      65.044  36.864 -10.231  1.00 15.24           C       
ATOM   1258  O     A02 B 106      65.287  36.428  -9.107  1.00 13.96           O
protein protein-structure • 2.5k views
11.5 years ago

The problem is very likely that you are mixing spaces and tabs, or that the format is otherwise incorrect. PDB is a fixed-width format: each field must occupy specific character columns on the line.
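To make "fixed width" concrete, here is a minimal Python sketch that reads an ATOM line purely by character position. The column numbers are the ones I know from the wwPDB format description for ATOM records; the function name `parse_atom_record` is made up for this illustration, and only a subset of the fields is extracted.

```python
def parse_atom_record(line):
    """Slice one ATOM/HETATM line by fixed character positions.

    Python slices are 0-based and end-exclusive, so PDB columns 18-20
    become line[17:20].
    """
    return {
        "record": line[0:6].strip(),       # columns 1-6
        "serial": int(line[6:11]),         # columns 7-11
        "atom_name": line[12:16].strip(),  # columns 13-16
        "res_name": line[17:20].strip(),   # columns 18-20
        "chain_id": line[21],              # column 22
        "res_seq": int(line[22:26]),       # columns 23-26
        "x": float(line[30:38]),           # columns 31-38
        "y": float(line[38:46]),           # columns 39-46
        "z": float(line[46:54]),           # columns 47-54
    }
```

On the lines from the original file this returns the expected residue name, chain and coordinates. On the awk-edited lines the residue name sits two columns too far to the right, so the residue number slice no longer contains a number and `int()` raises `ValueError`. A viewer that reads by column runs into the same kind of problem.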

In your original file this may not be visible because a tab can be displayed as wide as 4 spaces. Now that I have formatted your example, you can see the difference yourself.

Make sure that there are no tabs and that the columns line up exactly.
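A rough way to check this automatically is sketched below in Python. It only looks for tab characters and tests whether the three coordinate fields (columns 31-54 of an ATOM/HETATM record) still parse as numbers, so it is a sanity check under those assumptions rather than a full validator. The name `find_format_problems` is hypothetical.

```python
def find_format_problems(line):
    """Return a list of human-readable problems found in one PDB line."""
    problems = []
    if "\t" in line:
        problems.append("contains a tab character")
    if line.startswith(("ATOM", "HETATM")):
        # (label, start, end) are 0-based, end-exclusive slice bounds.
        for label, start, end in (("x", 30, 38), ("y", 38, 46), ("z", 46, 54)):
            field = line[start:end]
            try:
                float(field)
            except ValueError:
                problems.append(
                    f"{label} field (columns {start + 1}-{end}) "
                    f"is not a number: {field!r}"
                )
    return problems
```

Running it over every line of the edited file and printing the non-empty results should point at the shifted lines. A shift can occasionally leave a field that still parses as a number, so an empty result does not prove the line is correct.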

EDIT: Instead of awk, try something like sed to modify only small parts of each line; awk splits the line into fields and merges the consecutive whitespace characters.
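The same "touch only a small part of the line" idea can be sketched in Python instead of sed: overwrite just the residue name in columns 18-20 and leave every other character where it is. The function name `rename_residues` and the mapping keyed by chain ID and residue number are assumptions made for this example, not something taken from the question.

```python
def rename_residues(lines, new_names):
    """Yield PDB lines with the residue name (columns 18-20) replaced.

    new_names maps (chain ID, residue number) to a name of at most
    three characters, e.g. {("B", 105): "A01", ("B", 106): "A02"}.
    """
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            key = (line[21], int(line[22:26]))
            if key in new_names:
                name = new_names[key]
                if len(name) > 3:
                    raise ValueError(f"residue name too long: {name!r}")
                # Overwrite only columns 18-20; nothing else moves.
                line = line[:17] + name.rjust(3) + line[20:]
        yield line
```

Feeding it the lines of the original file with a mapping such as `{("B", 105): "A01", ("B", 106): "A02"}` and writing the results to a new file should give lines of the same length as the input, with the coordinates still in their original columns.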


As Istvan notes, the PDB file format uses fixed-width columns; for details, see the format documentation at http://www.wwpdb.org/docs.html.

The fixed-width nature of the columns in the PDB format causes problems for large structures, where specific columns may not be wide enough. The PDBML and mmCIF formats avoid this problem and will be used for all large structures in the future (see Deposition and Release of PDB Entries Containing Large Structures).
