Hi everyone,
I need a list of which PDB chains are sequence-unique.
For example, PDB entry 2r9r has 4 chains, but some of them have identical SEQRES sequences. See here, where it says:
The structure 2R9R has in total 4 chains. These are represented by 2 sequence-unique entities.
The chains are A, B, G and H, where A is identical to G and B is identical to H. I want the following sets:
2r9r: (A,G), (B,H)
The naive approach would be to download the FASTA files and do the check myself. This file might be a suitable source:
ftp://ftp.wwpdb.org/pub/pdb/derived_data/pdb_seqres.txt
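For reference, the naive check could look roughly like the sketch below. It assumes the SEQRES dump is FASTA-like, with header lines of the form ">2r9r_A mol:protein length:..." followed by the sequence; that layout, and the function names group_identical_chains and format_groups, are my own assumptions for illustration, not something I have verified against the full file.

```python
from collections import defaultdict


def group_identical_chains(lines):
    """Group the chains of each PDB entry by identical SEQRES sequence.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Assumed header layout: ">2r9r_A mol:protein length:..." (entry_chain first).
    """
    # entry -> sequence -> list of chain ids sharing that sequence
    by_entry = defaultdict(lambda: defaultdict(list))
    header = None
    parts = []

    def flush():
        # Store the record collected so far, if any.
        if header is not None:
            entry, _, chain = header.partition("_")
            by_entry[entry]["".join(parts)].append(chain)

    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()
            header = line[1:].split()[0]
            parts = []
        else:
            # Sequences may span one or several lines; join them.
            parts.append(line)
    flush()

    return {
        entry: [tuple(chains) for chains in seqs.values()]
        for entry, seqs in by_entry.items()
    }


def format_groups(entry, groups):
    """Render groups in the form '2r9r: (A,G), (B,H)'."""
    return f"{entry}: " + ", ".join("(" + ",".join(g) + ")" for g in groups)
```

Passing an open file handle of the downloaded dump to group_identical_chains should give one list of chain groups per entry. Note that this compares sequences as exact strings only, so it says nothing about near-identical chains.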
I'm wondering, though, whether such a list already exists. I'm also open to suggestions on how best to build it myself.
Edit: To clarify, I can compile this myself; the main question is whether the list is already available. That would be more convenient than iterating over the full sequence data every time something changes.
Thanks,
Jonas
Hi, not exactly. I tried to clarify in the question. What I would expect your command line to return is basically 2r9r: A, B. Interestingly, it only returns 2r9r_B; maybe I don't completely understand it. Anyway, what I need is, for every chain ID, the list of other chain IDs it is identical to. I hope that makes it clearer.
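To make the expected output concrete, here is a small sketch of the per-chain lookup I have in mind. The function name equal_chains is hypothetical, and the input is assumed to be the groups of identical chains for one entry, e.g. (A,G) and (B,H) for 2r9r.

```python
def equal_chains(groups):
    """For each chain id, list the other chain ids in the same group."""
    return {
        chain: [other for other in group if other != chain]
        for group in groups
        for chain in group
    }
```

For 2r9r this would map A to G, G to A, B to H and H to B; a chain with no identical partner would map to an empty list.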