Hi all,
I am trying to extract the ATOM lines, which contain the coordinates, from a PDB file. I have written the code below, but it does not produce any output file ("output.pdb") when I run the program.
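use strict;
use warnings;

print "Enter the file name: ";
chomp(my $file = <STDIN>);    # read the file name and strip the newline
# Open both files before the loop; the original printed to FH1 before opening it
open(my $in,  '<', $file)        or die "Cannot open $file: $!";
open(my $out, '>', 'output.pdb') or die "Cannot open output.pdb: $!";
while (my $line = <$in>)
{
    # ATOM records start with "ATOM" in columns 1-4
    if ($line =~ /^ATOM/)
    {
        print $out $line;
    }
}
close($in);
close($out);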
However, how can I extract only a particular region, for example a particular domain of interest?
Well, thanks for the tutorial. What if the domain of interest is a loop region instead of a chain?
For a loop region you can use the atom serial numbers (columns 7-11) or the residue sequence numbers (columns 23-26) if it's a monomeric protein. If it's oligomeric, a residue range alone would extract those residues from every monomer, so you would also need to check the chain ID (column 22).
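To make the column-based selection concrete, here is a minimal sketch in Perl. It assumes the standard fixed-width PDB layout (chain ID in column 22, residue number in columns 23-26). The sub name in_region and the command-line arguments are my own choices for illustration, not anything from the posts above.

```perl
use strict;
use warnings;

# Return 1 if $line is an ATOM record whose residue number lies in
# [$first, $last], else 0. $chain is optional; when given, the chain ID
# must match as well. (Hypothetical helper, named for this example only.)
sub in_region {
    my ($line, $first, $last, $chain) = @_;
    return 0 unless $line =~ /^ATOM/ && length($line) >= 26;
    my $chain_id = substr($line, 21, 1);    # column 22
    my $res_seq  = substr($line, 22, 4);    # columns 23-26
    $res_seq =~ s/\s+//g;                   # field is right-justified
    return 0 unless $res_seq =~ /^-?\d+$/;
    return 0 if defined $chain && $chain_id ne $chain;
    return $res_seq >= $first && $res_seq <= $last ? 1 : 0;
}

# Assumed usage: perl region.pl input.pdb 45 60 A > loop.pdb
if (@ARGV >= 3) {
    my ($file, $first, $last, $chain) = @ARGV;
    open(my $in, '<', $file) or die "Cannot open $file: $!";
    while (my $line = <$in>) {
        print $line if in_region($line, $first, $last, $chain);
    }
    close($in);
}
```

Leaving out the last argument selects the residue range from every chain, which is the oligomer behaviour described above. Insertion codes (column 27) and HETATM records are ignored in this sketch.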
If your selection is more complex than just a range of atom numbers, it might be better to use VMD like @dimkal suggests.
BioPerl is nice for dealing with sequence data, but as far as I know it won't help you with atom selections.
Actually, Bioperl has some useful modules for PDB analysis; see for example http://search.cpan.org/~cjfields/BioPerl-1.6.901/Bio/Structure/IO.pm, http://search.cpan.org/~cjfields/BioPerl-1.6.901/Bio/Structure/Atom.pm.
Also +1 for the answer; questioner really needs to understand the basics of loops, file open, read and write in Perl.
Also, to respond to "what if the region of interest is in a loop instead of a chain": in PDB/protein structure parlance, a chain generally refers to a monomer or individual peptide subunit of the resolved structure. So, for example, a homotetramer is composed of four chains, each of which represents one monomer in the structure. No secondary structure unit is typically referred to as a chain.
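If it helps to see which chains a file actually contains, a rough Perl sketch along these lines counts the ATOM records per chain ID. It assumes the standard layout with the chain ID in column 22. The sub name atoms_per_chain is made up for this example.

```perl
use strict;
use warnings;

# Count ATOM records per chain ID (column 22) and return a hash reference.
# (Hypothetical helper, named for this example only.)
sub atoms_per_chain {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        next unless $line =~ /^ATOM/ && length($line) >= 22;
        $count{ substr($line, 21, 1) }++;
    }
    return \%count;
}

# Assumed usage: perl chains.pl input.pdb
if (@ARGV) {
    open(my $in, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    my $count = atoms_per_chain(<$in>);
    close($in);
    print "chain '$_': $count->{$_} ATOM records\n" for sort keys %$count;
}
```

For a homotetramer you would expect four chain IDs with roughly equal counts, although some files leave the chain ID blank for a single-chain structure.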