Question

How Can I Get All The Domain Structures In A CATH Superfamily?

6

14.3 years ago

Simon Cockell 7.4k

I want to retrieve all of the PDB files for domains that belong to a particular superfamily in the CATH database.

I can get representative structures of families from the CATH download area, and I can also get lists of domains and domain boundaries. So I could construct the domain files myself by pulling entries from the PDB and trimming them programmatically, but I was wondering if anyone here has come up with a more efficient method?

I'll proceed with writing the code and post it as an answer once I'm done, but maybe someone can provide me with a better solution in the meantime.
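
For reference, the manual route described above (take a full PDB entry and keep only the residues inside the domain boundaries) might look roughly like this. It is only a sketch: the function name and the `segments` argument are made up for illustration, the boundaries are assumed to be plain residue-number ranges on one chain, and insertion codes are ignored.

```python
def trim_pdb_to_domain(pdb_lines, chain_id, segments):
    """Keep only ATOM/HETATM records inside the given chain and residue ranges.

    segments is a list of inclusive (start, stop) residue-number pairs,
    one pair per domain segment (hypothetical input format).
    """
    kept = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if len(line) < 26:
            continue  # too short to hold chain ID and residue number
        # Fixed-column PDB format: chain ID is column 22,
        # residue sequence number is columns 23-26.
        if line[21] != chain_id:
            continue
        try:
            res_num = int(line[22:26])
        except ValueError:
            continue  # malformed residue number, skip the record
        if any(start <= res_num <= stop for start, stop in segments):
            kept.append(line)
    return kept
```

A discontinuous domain would simply get more than one pair in `segments`.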

pdb database • 6.8k views

Hi, Andy from CATH here :-)

Do you mean the original whole-protein PDBs from which the domains were extracted, or the chopped PDBs we produce that just contain individual domains?

14.3 years ago by Andrew Clegg ▴ 310

score 12 · Answer 1 · 2010-10-07

14.3 years ago

Andrew Clegg ▴ 310

Sorry, ignore my (deleted) comment, I read your question again and saw the bit about trimming.

Once you have the domain list, you can hit this URL to get an individual domain:

http://www.cathdb.info/api/data/pdb/1cukA01

Where 1cukA01 is the domain in question. Or for a whole chain:

http://www.cathdb.info/api/data/pdb/1cukA

... or indeed a whole PDB entry the same way.
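
As a rough illustration of how those URLs could be used from a script, here is a minimal Python sketch. The helper names are invented for this example, and it assumes the endpoint still responds at the address quoted above and returns plain PDB text.

```python
from urllib.request import urlopen

# Base of the URLs quoted above; everything after it is the identifier.
CATH_PDB_ENDPOINT = "http://www.cathdb.info/api/data/pdb/"


def cath_pdb_url(identifier):
    """Build the URL for a domain (1cukA01), a chain (1cukA) or a whole entry."""
    return CATH_PDB_ENDPOINT + identifier


def fetch_cath_pdb(identifier, timeout=30):
    """Download the PDB text for one identifier (needs network access)."""
    with urlopen(cath_pdb_url(identifier), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Looping `fetch_cath_pdb` over a list of domain IDs and writing each result to its own file would then cover a whole superfamily.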

We're adding this to the docs at the moment, didn't realise it wasn't in there :-)

Also planning to add a REST service to get all the members of a superfamily via HTTP, but you can parse this out of the domain list as you spotted.
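
Parsing the members out of the domain list could be sketched like this. It assumes the list is whitespace-separated with the domain ID in the first column and the class, architecture, topology and homologous superfamily numbers in the next four, and that comment lines start with `#`; check that against the file you actually download. The function name is made up for the example.

```python
def domains_in_superfamily(domain_list_lines, superfamily):
    """Return the domain IDs whose C.A.T.H code equals `superfamily`.

    `superfamily` is a dotted code with four parts, such as "1.10.8.10".
    Assumed column layout: domain ID, C, A, T, H, then further columns
    that are ignored here.
    """
    wanted = superfamily.split(".")
    members = []
    for line in domain_list_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 5:
            continue  # not a complete domain record
        if fields[1:5] == wanted:
            members.append(fields[0])
    return members
```

Passing an open file handle as `domain_list_lines` works, since the function only iterates over lines.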

It's a couple of years since I used CATH. I'm amazed and impressed by its transformation.

14.3 years ago by Neilfws 49k

That's perfect! I love BioStar...

14.3 years ago by Simon Cockell 7.4k

Thanks Neil -- spread the love :-)

14.3 years ago by Andrew Clegg ▴ 310

PS... You should see the next version (end of this year)

14.3 years ago by Andrew Clegg ▴ 310

score 6 · Answer 2 · 2010-10-07

14.3 years ago

Simon Cockell 7.4k

The code I used to answer this question can be found here.

Much simpler code than I thought would be necessary, thanks to Andrew :)

score 0 · Answer 3 · 2011-08-11

This answer arrives very late but it could help others.

PDBe has a service called PDBeXplore. This service cross-references data between different databases/classification systems and PDBe (one of those classification systems is CATH).

If you go to the PDBeXplore - CATH module and select a CATH class, you will be able to see all the PDB entries belonging to it. You can also download the whole table as a tab-delimited file (right-hand panel). From there, it is easy to parse the file and download the PDB files one by one.
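
The parsing step could be sketched as follows. The column header `"PDB ID"` is a guess, not something taken from the real export, so check the first line of the downloaded file and pass the actual name; the function name is also just for illustration.

```python
import csv


def pdb_ids_from_table(tsv_lines, id_column="PDB ID"):
    """Collect unique, lower-cased PDB IDs from a tab-delimited table.

    `id_column` is a hypothetical header name; replace it with the
    header actually used in the downloaded file.
    """
    reader = csv.DictReader(tsv_lines, delimiter="\t")
    seen = set()
    ids = []
    for row in reader:
        pdb_id = (row.get(id_column) or "").strip().lower()
        if pdb_id and pdb_id not in seen:
            seen.add(pdb_id)  # keep first occurrence, preserve file order
            ids.append(pdb_id)
    return ids
```

Each ID in the returned list can then be fetched from whichever PDB mirror you normally use.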

Hope that helps.