2.5 years ago
user366312
I am trying to parse PDB files.
Say, a PDB file has the following data:
ATOM 33 N ATHR A 2 4.935 -11.632 15.046 0.74 2.95 N
ATOM 34 N BTHR A 2 5.078 -11.406 15.180 0.31 2.78 N
ATOM 35 CA ATHR A 2 5.757 -11.521 13.850 0.81 3.02 C
ATOM 36 CA BTHR A 2 5.773 -11.153 13.921 0.20 2.67 C
ATOM 37 C ATHR A 2 7.070 -10.839 14.210 0.74 2.82 C
ATOM 38 C BTHR A 2 7.155 -10.559 14.193 0.29 1.80 C
ATOM 39 O ATHR A 2 7.152 -9.941 15.050 0.80 3.31 O
ATOM 40 O BTHR A 2 7.214 -9.641 15.012 0.25 2.41 O
ATOM 41 CB ATHR A 2 4.976 -10.693 12.813 0.87 5.53 C
ATOM 42 CB BTHR A 2 4.896 -10.354 12.941 0.25 12.07 C
ATOM 43 OG1ATHR A 2 4.611 -9.432 13.388 1.00 6.88 O
ATOM 44 OG1BTHR A 2 3.743 -11.083 12.501 0.25 9.57 O
ATOM 45 CG2ATHR A 2 3.858 -11.584 12.293 0.75 10.03 C
ATOM 46 CG2BTHR A 2 5.683 -9.885 11.726 0.27 5.90 C
ATOM 47 H ATHR A 2 4.547 -10.814 15.527 0.75 3.44 H
ATOM 48 H BTHR A 2 5.510 -10.211 15.754 0.25 2.90 H
ATOM 49 HA ATHR A 2 5.962 -12.339 13.548 0.75 3.32 H
ATOM 50 HA BTHR A 2 4.036 -9.929 13.477 0.25 2.86 H
ATOM 51 HB ATHR A 2 5.648 -10.589 11.938 0.75 5.43 H
ATOM 52 HB BTHR A 2 4.644 -9.326 13.574 0.25 5.67 H
ATOM 53 HG1ATHR A 2 5.030 -9.344 14.216 0.75 8.74 H
ATOM 54 HG1BTHR A 2 3.236 -11.198 13.399 0.25 10.21 H
ATOM 55 HG21ATHR A 2 4.096 -12.441 11.924 0.75 10.92 H
ATOM 56 HG21BTHR A 2 6.542 -9.278 12.024 0.25 9.66 H
ATOM 57 HG22ATHR A 2 3.222 -10.974 11.650 0.75 10.92 H
ATOM 58 HG22BTHR A 2 5.039 -9.142 11.179 0.25 9.66 H
ATOM 59 HG23ATHR A 2 3.163 -11.738 13.200 0.75 10.92 H
ATOM 60 HG23BTHR A 2 5.904 -10.639 11.169 0.25 9.66 H
We see that there are many alternative atoms in the 2nd residue.
How should I choose the atoms?
Should I just pick one atom at random from the alternatives? How can my parser tell that two such atoms (say, 33 and 34) are alternatives of each other? From the parsing point of view, I see no indication that they are related.
How can I parse alternative atom information in a PDB file?
It would help to review how the PDB format encodes alternate locations and which columns carry that information. In particular, your statement that "from the parsing point of view, there is no indication that they are related" needs reconsidering: column 17 holds the alternate location indicator (the A or B directly in front of THR in your records), and the occupancy value tells you how much weight each alternative carries. See Proteopedia's page on 'Alternate locations' for both.
That page also describes a way to visualize the alternate conformations in a PDB file using FirstGlance in Jmol. FirstGlance's own notes on the example alternate locations go beyond the brief coverage on the Proteopedia page.
The associated publications often describe the different conformations, too.
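To make the point about column 17 and occupancy concrete, here is a minimal Python sketch, not a full PDB parser. It reads ATOM records by fixed column position (splitting on whitespace fails on records like "OG1ATHR", where the atom name and the alternate location indicator touch), groups atoms that share chain, residue number, insertion code and atom name, and keeps the alternative with the highest occupancy. The function names parse_atom_line and pick_highest_occupancy are made up for this example, and the sample lines are six records from the question re-spaced to the standard fixed-width layout, which is an assumption since the spacing was lost in the post.

```python
# Six records from the question, re-spaced to the fixed-width PDB layout
# (assumed; the original column spacing was collapsed in the post).
SAMPLE = """\
ATOM     33  N  ATHR A   2       4.935 -11.632  15.046  0.74  2.95           N
ATOM     34  N  BTHR A   2       5.078 -11.406  15.180  0.31  2.78           N
ATOM     35  CA ATHR A   2       5.757 -11.521  13.850  0.81  3.02           C
ATOM     36  CA BTHR A   2       5.773 -11.153  13.921  0.20  2.67           C
ATOM     43  OG1ATHR A   2       4.611  -9.432  13.388  1.00  6.88           O
ATOM     44  OG1BTHR A   2       3.743 -11.083  12.501  0.25  9.57           O
"""


def parse_atom_line(line):
    """Slice one ATOM/HETATM record by column (hypothetical helper)."""
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "altloc": line[16].strip(),       # column 17, "" when absent
        "resname": line[17:20].strip(),   # columns 18-20
        "chain": line[21],                # column 22
        "resseq": int(line[22:26]),       # columns 23-26
        "icode": line[26].strip(),        # column 27
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "occupancy": float(line[54:60]),  # columns 55-60
    }


def pick_highest_occupancy(lines):
    """Keep one atom per (chain, resseq, icode, name): the most occupied one.

    Ties keep the first alternative encountered.
    """
    best = {}
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom = parse_atom_line(line)
        key = (atom["chain"], atom["resseq"], atom["icode"], atom["name"])
        if key not in best or atom["occupancy"] > best[key]["occupancy"]:
            best[key] = atom
    return sorted(best.values(), key=lambda a: a["serial"])


if __name__ == "__main__":
    for atom in pick_highest_occupancy(SAMPLE.splitlines()):
        print(atom["serial"], atom["name"], atom["altloc"], atom["occupancy"])
```

Choosing per atom is the simplest rule, but it can mix atoms from different conformers within one residue. If you need a geometrically consistent residue, an alternative is to pick one alternate location label per residue (for example the one with the highest summed occupancy) and keep only atoms with that label or with a blank column 17.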