How Does One Programmatically (Python) Download PDB Structures By Keyword
10.7 years ago
Burke ▴ 290

I would like to download all hemagglutinin structures for influenza virus from the Protein Data Bank via a Python script. I have looked through the PDB website and the Biopython Bio.PDB package for a way to do this, with no luck. Does anyone know if this is possible?

pdb python search • 19k views
10.7 years ago
User000 ▴ 720

You need to know the PDB IDs you want in advance and list them; the script below then downloads them automatically. Go to the PDB website, search for what you are interested in, select all the IDs you think are relevant, and choose Reports -> List selected IDs.

from Bio.PDB import PDBList

# Download each listed structure into the local 'PDB' directory.
pdbl = PDBList()
pdb_ids = ['4B97', '4IPH', '4HNO', '4HG7', '4IRG', '4G4W', '4JKW', '4IPC', '2YPM', '4KEI']
for pdb_id in pdb_ids:
    pdbl.retrieve_pdb_file(pdb_id, pdir='PDB')
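
If the ID list exported from Reports -> List selected IDs is saved to a text file, it can be parsed instead of being pasted into the script. This is only a minimal sketch: it assumes the export is a comma- or whitespace-separated list of classic four-character IDs that start with a digit, and the file name ids.txt and the helper names parse_pdb_ids and download_all are made up for illustration.

```python
import re


def parse_pdb_ids(text):
    """Return unique, upper-cased 4-character PDB IDs in order of first appearance."""
    ids = []
    for token in re.split(r"[,\s]+", text.strip()):
        token = token.upper()
        # Assumption: classic PDB IDs are one digit followed by three alphanumerics.
        if re.fullmatch(r"[0-9][A-Z0-9]{3}", token) and token not in ids:
            ids.append(token)
    return ids


def download_all(ids, pdir="PDB"):
    """Fetch every ID with Biopython's PDBList into the directory pdir."""
    # Imported here so that parse_pdb_ids works even without Biopython installed.
    from Bio.PDB import PDBList

    pdbl = PDBList()
    for pdb_id in ids:
        pdbl.retrieve_pdb_file(pdb_id, pdir=pdir)


# Example usage (hypothetical file name):
# with open("ids.txt") as handle:
#     download_all(parse_pdb_ids(handle.read()))
```

Keeping the parsing separate from the download means the ID list can be checked before any network request is made.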

How do I download the PDB IDs of an entire set of soluble enzymes from the PDB, then select only the non-membrane-bound enzymes from that list, and remove redundant sequences by keeping only the highest-resolution structure for each?

10.7 years ago

Have a look at the PDB's REST APIs, their documentation, and the example Python program provided on the same site.
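
For the keyword part of the question, a search request can be sketched roughly as follows. This is not the example program mentioned above. It assumes the RCSB Search API accepts a JSON POST at the URL shown and returns hits under a "result_set" key with an "identifier" field, which is my understanding of the service but should be checked against the current RCSB documentation. The function names build_query, extract_ids, and search are made up for illustration.

```python
import json
import urllib.request

# Assumption: endpoint of the RCSB Search API; verify against the current docs.
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def build_query(keywords, rows=100, start=0):
    """Build a full-text search payload that asks for entry (PDB ID) results."""
    return {
        "query": {
            "type": "terminal",
            "service": "full_text",
            "parameters": {"value": keywords},
        },
        "return_type": "entry",
        "request_options": {"paginate": {"start": start, "rows": rows}},
    }


def extract_ids(response):
    """Pull the PDB IDs out of a decoded search response."""
    return [hit["identifier"] for hit in response.get("result_set", [])]


def search(keywords, rows=100):
    """POST the query and return the matching PDB IDs (needs network access)."""
    body = json.dumps(build_query(keywords, rows)).encode("utf-8")
    request = urllib.request.Request(
        SEARCH_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        raw = reply.read()
    # An empty body is treated as "no hits".
    return extract_ids(json.loads(raw)) if raw else []


# Example usage:
# ids = search("hemagglutinin influenza")
```

The returned IDs can then be passed to PDBList.retrieve_pdb_file one at a time, as in the answer above. A plain full-text search is likely to return some unrelated entries, so the list is worth reviewing before downloading.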


The link is gone. What is the API now? Is there any chance of getting a documented Python client for it?
