Extracting Coordinates From Pdb File & Extracting The Region Of Interest
12.8 years ago
Repsy ▴ 20

Hi all,

I am trying to extract the ATOM lines, which contain the coordinates, from a PDB file. I have written the code below, but it does not produce any output file when I run the program.

print" Enter the file name";
$a=<>;
@arr=split(" ",$a);
if($i=0; $i< scalar @arr; $i++)
foreach $values(@arr)
{
if($values=~/^ATOM/)
{
print FH1 $a;
open(FH1,">>output.pdb")
}
}

Also, how can I extract only a particular region, such as a specific domain of interest?

perl pdb structure • 8.7k views
12.8 years ago

There are several things that are causing this not to produce your expected output:

  • You are not iterating properly over your input file. Take a look at a tutorial on file input in Perl and read the section on reading a file one line at a time. Here is a tutorial that might help: http://www.troubleshooters.com/codecorn/littperl/perlfile.htm
  • You should always open() your FH1 output file handle before you start processing the data, not inside the loop.
  • Always close() your FH1 file handle once the loop has finished.
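
The three points above can be sketched roughly as follows. This is a minimal illustration, not the only way to do it: the sub name extract_atom_lines and its two path arguments are made up for the example, and it opens the output with '>' (overwrite) rather than the '>>' (append) used in the question.

```perl
use strict;
use warnings;

# Copy every ATOM record from $in_path to $out_path and return the count.
# The sub name and its arguments are hypothetical, chosen for this sketch.
sub extract_atom_lines {
    my ($in_path, $out_path) = @_;

    # Open both handles once, before the loop starts.
    open(my $in,  '<', $in_path)  or die "Cannot read $in_path: $!";
    open(my $out, '>', $out_path) or die "Cannot write $out_path: $!";

    my $count = 0;
    while (my $line = <$in>) {           # read one line per iteration
        next unless $line =~ /^ATOM/;    # record name starts in column 1
        print {$out} $line;
        $count++;
    }

    # Close both handles once the loop has finished.
    close($in);
    close($out) or die "Cannot close $out_path: $!";
    return $count;
}
```

It could then be called as extract_atom_lines('input.pdb', 'output.pdb'), where both file names are placeholders.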

To extract only a particular chain, parse each ATOM line for its chain ID and check whether it is the one you're interested in. You can either use split() or read fixed columns of the string; columns are safer, because PDB is a fixed-width format and adjacent fields can run together with no whitespace between them. See the PDB file specification for the column layout. The chain ID is in column 22. You may find the atom serial numbers, which are in columns 7-11, more convenient.
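
As a rough sketch of the column-based approach, assuming the standard fixed-column PDB layout in which the chain ID is the single character in column 22 (offset 21 for substr()): the sub name select_chain is invented for this example.

```perl
use strict;
use warnings;

# Return the ATOM lines whose chain ID (column 22) equals $chain.
# select_chain is a hypothetical helper name, not part of any library.
sub select_chain {
    my ($chain, @lines) = @_;
    return grep {
        /^ATOM/
            && length($_) >= 22
            && substr($_, 21, 1) eq $chain    # column 22 is offset 21
    } @lines;
}
```

For instance, select_chain('B', @all_lines) would keep only the ATOM records of chain B, given an array @all_lines read from the file.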

Thanks for the tutorial. What if the domain of interest is a loop region instead of a chain?

For a loop region you can use the atom serial numbers (columns 7-11) or the residue numbers (columns 23-26) if it's a monomeric protein (if it's oligomeric, you would be extracting those residues from each monomer).
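
A residue-range selection along those lines might look like the sketch below. It assumes the standard fixed-column layout (chain ID in column 22, residue number in columns 23-26) and also filters on the chain, so that an oligomer does not return the same residues from every monomer. The sub name select_residues is hypothetical.

```perl
use strict;
use warnings;

# Return the ATOM lines of chain $chain whose residue number lies in
# [$first, $last]. select_residues is a made-up name for this sketch.
sub select_residues {
    my ($chain, $first, $last, @lines) = @_;
    my @hits;
    for my $line (@lines) {
        next unless $line =~ /^ATOM/ && length($line) >= 26;
        next unless substr($line, 21, 1) eq $chain;   # column 22
        my $resseq = substr($line, 22, 4);            # columns 23-26
        $resseq =~ s/\s+//g;
        next unless $resseq =~ /^-?\d+$/;             # skip malformed fields
        push @hits, $line if $resseq >= $first && $resseq <= $last;
    }
    return @hits;
}
```

A loop spanning residues 45 to 52 of chain A (numbers chosen only for illustration) would then be select_residues('A', 45, 52, @all_lines).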

If your selection is more complex than just a range of atom numbers, it might be better to use VMD like @dimkal suggests.

BioPerl is nice for dealing with sequence data, but as far as I know it won't help you with atom selections.

Also +1 for the answer; the questioner really needs to understand the basics of loops and of opening, reading, and writing files in Perl.

Also, to respond to "what if the region of interest is in a loop instead of a chain": in PDB/protein structure parlance, a chain generally refers to a monomer or individual peptide subunit of the resolved structure. A homotetramer, for example, is composed of four chains, each of which represents one monomer in the structure. There is no secondary structure unit typically referred to as a chain.

12.8 years ago
dimkal ▴ 730

I suggest you use a tool built to work with PDB files. Look into VMD, which has a Tcl-based scripting language designed specifically to work with protein structures.

Say you want to write residues 1-100 of the protein's chain B to a file. You'd use commands that look something like this:

vmd> mol load pdb 1abc.pdb
vmd> set peptide [atomselect top "protein and chain B and resid 1 to 100"]
vmd> $peptide writepdb peptB.pdb
12.8 years ago
Woa ★ 2.9k

Maybe PerlMol and/or BioPerl can be helpful:

How To Extract Just The Coordinate Values From A Pdb File, Using Perl Only?

12.7 years ago

To extract ATOM coordinates, you can either use substr() or split the line into characters and take a range (an array slice). This code will help you. It creates a text file with the x, y, z coordinates. If you need a particular output layout, you can use Perl's format STDOUT.

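# PDB columns are fixed-width: x = 31-38, y = 39-46, z = 47-54 (1-based),
# which are array indices 30..37, 38..45 and 46..53 after split('').
open(FILE, "<", "PDBID.pdb") or die "Cannot open PDBID.pdb: $!";
open(OUT, ">", "output.txt") or die "Cannot open output.txt: $!";

while (<FILE>)
{
if (/^ATOM/)
{
# Split the line into single characters so columns can be sliced.
@aa = split('', $_);
$x = join('', @aa[30..37]);
$x =~ s/\s//g;
$y = join('', @aa[38..45]);
$y =~ s/\s//g;
$z = join('', @aa[46..53]);
$z =~ s/\s//g;
print OUT "$x  $y   $z\n";
}
}
close(FILE);
close(OUT);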

Can you tell me how to calculate the all-against-all distances between atoms using this?
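
One possible way to do this, sketched under the assumption that the coordinates sit in the standard fixed columns 31-38, 39-46 and 47-54 of each ATOM line: parse them into numbers, then compute the Euclidean distance for every unique pair. The sub names atom_xyz, distance and pairwise_distances are invented for this example. The number of pairs grows quadratically with the number of atoms, so large structures will take noticeably longer.

```perl
use strict;
use warnings;

# Parse x, y, z as numbers from columns 31-38, 39-46, 47-54 of an ATOM line.
sub atom_xyz {
    my ($line) = @_;
    return map { substr($line, $_, 8) + 0 } (30, 38, 46);
}

# Euclidean distance between two [x, y, z] array references.
sub distance {
    my ($p, $q) = @_;
    return sqrt(   ($p->[0] - $q->[0]) ** 2
                 + ($p->[1] - $q->[1]) ** 2
                 + ($p->[2] - $q->[2]) ** 2 );
}

# Distances for all unique atom pairs (i < j) among the ATOM lines given.
# Returns a list of [i, j, distance], with i and j counting ATOM lines from 0.
sub pairwise_distances {
    my @coords = map  { [ atom_xyz($_) ] }
                 grep { /^ATOM/ && length($_) >= 54 } @_;
    my @result;
    for my $i (0 .. $#coords - 1) {
        for my $j ($i + 1 .. $#coords) {
            push @result, [ $i, $j, distance($coords[$i], $coords[$j]) ];
        }
    }
    return @result;
}
```

Each returned element can be printed with something like printf "%d %d %.3f\n", @$_ for pairwise_distances(@all_lines), where @all_lines holds the lines of the PDB file.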
