Question

How To Interpret The PDB Chain And Residue Identifiers In SCOP Parseable Files

1

11.0 years ago

anveshi.charuvaka ▴ 10

I am trying to parse the SCOP parseable files, specifically dir.des.scop.txt ver 1.75, but I have been having problems with the PDB residue identifiers in the file. This is a tab-delimited file, and the PDB chain and residue identifier is in the 6th column. An example of this identifier is d1kk8a2 (1kk8 A:1-28,A:77-837) ==> domain_id (pdb_id chain&range). This particular example is straightforward: the domain d1kk8a2 consists of residues 1-28 and 77-837 of chain A of the corresponding PDB entry 1kk8. But some of them are unintuitive. In d3ckra1 (3ckr A:-2-385), what does the negative entry mean? And in d2p83b1 (2p83 B:61P-385), there is a P in the range.

If you go to the PDB website, search for the corresponding entries, and open the sequence tab, you will see the alignment of the PDB chain and the corresponding SCOP domains. The ranges shown there correspond to the entries in SCOP, but it is difficult to make sense of them. Can someone please explain or provide some pointers? Thank you.

pdb • 4.9k views

updated 11.0 years ago by andreas.prlic ▴ 290 • written 11.0 years ago by anveshi.charuvaka ▴ 10

score 0 · Answer 1 · 2014-02-25

PDB residues are identified by a residue number plus an optional insertion code. Residue numbers can be any integer, including negative values. In 3CKR the first residue is numbered -6, followed by -5, and so on, so in the SCOP range A:-2-385 the first hyphen is a minus sign and the domain starts at the residue numbered -2. Negative numbers might indicate that there are additional residues at the beginning of the sequence, relative to other PDB entries. In 2P83, the first residue is 61P, that is, residue number 61 with insertion code P. Take a look at https://lists.sdsc.edu/pipermail/pdb-l/2004-March/001513.html for an explanation of why PDB residue numbers have insertion codes.
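To make this concrete, here is a minimal Python sketch of how such a range string could be split into a chain, a start residue and an end residue, each residue being a (number, insertion code) pair. The names `ResidueId`, `Region`, `parse_region` and `parse_domain` are made up for this illustration and are not part of any SCOP or PDB tooling. The grammar is an assumption based on the examples in the question, plus the forms `A:` (whole chain) and `-` (whole entry), which I believe SCOP also uses.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class ResidueId(NamedTuple):
    number: int   # may be negative
    icode: str    # insertion code, "" when absent


class Region(NamedTuple):
    chain: Optional[str]        # None when no chain is given
    start: Optional[ResidueId]  # None when the whole chain/entry is meant
    end: Optional[ResidueId]


# A residue is an optionally negative integer followed by an optional letter.
_RESIDUE = r"(-?\d+)([A-Za-z]?)"
# Optional "X:" chain prefix, then an optional "start-end" range.
_REGION = re.compile(rf"^(?:(\w):)?(?:{_RESIDUE}-{_RESIDUE})?$")


def parse_region(text: str) -> Region:
    """Parse one region such as 'A:-2-385', 'B:61P-385', 'A:' or '-'."""
    text = text.strip()
    if text == "-":  # assumed to mean "the whole entry"
        return Region(None, None, None)
    match = _REGION.match(text)
    if not text or match is None:
        raise ValueError(f"unrecognised region: {text!r}")
    chain, s_num, s_icode, e_num, e_icode = match.groups()
    if s_num is None:  # chain only, e.g. 'A:'
        return Region(chain, None, None)
    return Region(chain,
                  ResidueId(int(s_num), s_icode),
                  ResidueId(int(e_num), e_icode))


def parse_domain(description: str) -> Tuple[str, List[Region]]:
    """Parse '1kk8 A:1-28,A:77-837' into the PDB id and its regions."""
    pdb_id, regions = description.split(None, 1)
    return pdb_id, [parse_region(part) for part in regions.split(",")]


if __name__ == "__main__":
    print(parse_region("A:-2-385"))
    print(parse_region("B:61P-385"))
    print(parse_domain("1kk8 A:1-28,A:77-837"))
```

Note that these identifiers are labels, not positions: because of negative numbers, insertion codes and possible gaps, the residues of a domain are best selected by matching (number, insertion code) pairs against the residues in the PDB file rather than by computing offsets into the sequence.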