9.7 years ago
fareehakanwal90
How do I extract specific residues from a PDB file with Biopython? For example, I need to extract only the 23rd, 24th and 27th residues. How do I do that?
Here is my code:
from Bio.PDB import *
from itertools import groupby
from operator import itemgetter
parser=PDBParser()
structure=parser.get_structure('X', '4wwt_loop_mcannotate.pdb')
faa=open('4wwt_loop_mcannotate.pd','w')
#fasta=[]
reslist=[]
structure=parser.get_structure('X' , '4wwt_loop_mcannotate.pdb')
for model in structure:
    for chain in model:
        for residue in chain:
            #few=residue.get_id[1]
            reslist.append(residue.get_id())
            faa.write(residue.get_resname().strip())
faa.close()
residuelist=[]
for items in reslist:
    for elemts in items:
        if elemts==' ':
            continue
        else:
            residuelist.append(elemts)
fraglist=[]
for k,g in groupby(enumerate(residuelist), lambda(i,x):i-x):
    fraglist.append(map(itemgetter(1),g))
print fraglist
myfrags=[]
myfragment=''
for ies in fraglist:
    for item in ies:
        for model in structure:
            for chain in model:
                for residue in chain:
                    if item in residue.get_id():
                        myfragment+=residue.get_resname().strip()
                        myfragment+=str(item)
                        myfrags.append(myfragment)
                        myfragment=''
print myfrags
And this is the output I am getting:
[[2724, 2725], [2741, 2742, 2743, 2744, 2745, 2746], [2792, 2793, 2794, 2795, 2796]]
['C2724', 'C2725', 'G2741', 'U2742', 'G2743', 'G2744', 'A2745', 'U2746', 'A2792', 'A2793', 'U2794', 'C2795', 'G2796']
What I want instead is a list of fragments, one string per run of consecutive residues, like [CC, GUGGAU, AAUCG].
Could someone please help?
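One possible approach, as a minimal sketch rather than a definitive answer: collect (residue number, residue name) pairs once, then join the names inside each run of consecutive numbers instead of appending after every residue. The helper names `residue_pairs`, `fragments` and `pick` are made up for illustration, the code is written for Python 3, and it assumes the residues come from a single chain in file order with one-letter residue names such as C, G, A, U. The demo data is taken from the output shown above, so it runs without Biopython or the PDB file.

```python
from itertools import groupby


def residue_pairs(pdb_path):
    """Read (residue number, residue name) pairs from a PDB file."""
    # Imported here so the rest of the sketch runs without Biopython.
    from Bio.PDB import PDBParser
    structure = PDBParser(QUIET=True).get_structure('X', pdb_path)
    # get_id() returns (hetero flag, sequence number, insertion code).
    return [(res.get_id()[1], res.get_resname().strip())
            for res in structure.get_residues()]


def fragments(pairs):
    """Join residue names over each run of consecutive residue numbers."""
    frags = []
    # Within a consecutive run, (number - position) stays constant.
    for _, run in groupby(enumerate(pairs), key=lambda t: t[1][0] - t[0]):
        frags.append(''.join(name for _, (_, name) in run))
    return frags


def pick(pairs, wanted):
    """Return the names of only the residues whose number is in wanted."""
    wanted = set(wanted)
    return [name for num, name in pairs if num in wanted]


# Demo data copied from the output in the question.
numbers = [2724, 2725, 2741, 2742, 2743, 2744, 2745, 2746,
           2792, 2793, 2794, 2795, 2796]
demo = list(zip(numbers, "CCGUGGAUAAUCG"))

print(fragments(demo))             # ['CC', 'GUGGAU', 'AAUCG']
print(pick(demo, [2724, 2745]))    # ['C', 'A']
```

With the real file, something like `fragments(residue_pairs('4wwt_loop_mcannotate.pdb'))` should give the fragment list, and `pick(residue_pairs(...), [23, 24, 27])` would select individual residues by number, provided those numbers exist in the file.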
Can you show your PDB file? Is it taken from the Protein Data Bank?
No, it's generated by a tool. Here it is: