Question

How To Download Atomic Coordinates For A Specific Chain From A Pdb File?

0

Entering edit mode

12.4 years ago

edweena ▴ 10

Hi,

What is the easiest way to download atomic coordinates for specific chains from multiple PDB files?

pdb • 8.0k views

ADD COMMENT • link updated 5.8 years ago by M.O.L.S ▴ 100 • written 12.4 years ago by edweena ▴ 10

score 1 · Answer 1 · 2012-07-19

1

Entering edit mode

12.4 years ago

Pappu ★ 2.1k

The following should work fine. You can use Biopython for more flexibility.

for i in 3EML 2VTP 2VEZ 
do 
wget -c "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId="$i -O $i.pdb
grep  ATOM $i.pdb | grep ' A  ' > $i\_A.pdb
done

ADD COMMENT • link 12.4 years ago by Pappu ★ 2.1k

0

Entering edit mode

Thank you for your answer. Using your code, I got the atomic coordinates of A chain. If I need to get B chain from the first PDB file, C chain from the second one, how to change your code? sorry, I am new to perl. Could you give the name of the biopython module?

ADD REPLY • link 12.4 years ago by edweena ▴ 10

0

Entering edit mode

Replace A with B and so on. Read bash and grep manual.

ADD REPLY • link 12.4 years ago by Pappu ★ 2.1k

score 1 · Answer 2 · 2014-03-30

1

Entering edit mode

10.7 years ago

Woa ★ 2.9k

It is better IMHO to download the entire PDB file, containing all the chain co-ordinates and then parse using some dedicated PDB parsers of Biopython, Bioperl or R (Bio-3d package) rather than any regex type parsers.

ADD COMMENT • link 10.7 years ago by Woa ★ 2.9k

score 0 · Answer 3 · 2014-03-30

Assuming that you are just interested in the ATOM records then you will need to take into account the structure of the ATOM records. These are not structured as delimited fields, but are instead position based, thus the identification of the specific chain records must look at a specific position in the record to get the correct ATOM records.

Something like the following Perl script will do the job:

#!/usr/bin/env perl

# Enable Perl warnings
use strict;
use warnings;

# Load modules.
use File::Basename;
use Getopt::Long qw(:config no_ignore_case bundling);
use LWP;

### Default parameters.
# Base URL for dbfetch service used to fetch PDB entries.
my $dbfetchBaseUrl = 'http://www.ebi.ac.uk/Tools/dbfetch/dbfetch';
# PDB entry identifier.
my $pdbEntryIdListStr = '3E3Q,3E3Q,3EML';
# PDB chain identifier.
my $pdbChainIdListStr = 'A,a,A';

my $scriptName = basename( $0, () );
my $usage = <<EOF Extract="" ATOM="" records="" for="" a="" specified="" chain="" from="" a="" PDB="" entry.="" Usage:="" $scriptName="" -h="" $scriptName="" [-e="" pdbId]="" [-c="" chainId]="" -e,="" --entry="" PDB="" entry="" identifier="" list.="" [$pdbEntryIdListStr]="" -c,="" --chain="" PDB="" chain="" identifier="" list="" (case="" sensitive).="" [$pdbChainIdListStr]="" The="" list="" of="" entry="" Ids="" and="" chains="" Ids="" are="" assumed="" to="" be="" parallel,="" so="" to="" get="" the="" ATOM="" records="" for="" 3E3Q="" chain="" 'A',="" 3E3Q="" chain="" 'a'="" and="" 3EML="" chain="" 'A'="" the="" parameters="" would="" be:="" $scriptName="" -e="" 3E3Q,3E3Q,3EML="" -c="" A,a,A="" -h,="" --help="" Help="" usage="" message.="" EOF="" ;="" my="" $helpFlag;="" GetOptions(="" 'entry|e="s'" ==""> \$pdbEntryIdListStr,
    'chain|c=s' => \$pdbChainIdListStr,
    'help|h' => \$helpFlag,
    );

if($helpFlag) {
    print $usage;
    exit(0);
}

my @pdbEntryIdList = split(/[ ,]+/, $pdbEntryIdListStr);
my @pdbChainIdList = split(/[ ,]+/, $pdbChainIdListStr);
if(scalar(@pdbEntryIdList) != scalar(@pdbChainIdList)) {
    print STDERR @pdbEntryIdList;
    print STDERR @pdbChainIdList;
    die 'PDB entry Id list and PDB chain Id list lengths differ.';
}
for(my $i = 0; $i < scalar(@pdbChainIdList); $i++) {
    my $pdbEntryId = $pdbEntryIdList[$i];
    my $pdbChainId = $pdbChainIdList[$i];
    # Get the entry.
    my $pdbEntry = &fetchEntry('pdb', $pdbEntryId);
    if($pdbEntry) {
        # Extract 'ATOM' records.
        my @records = split(/\n+/, $pdbEntry);
        my @header_records = grep(/^HEADER/, @records);
        my @atom_records = grep(/^ATOM  .{14} $pdbChainId/, @records);
        print @header_records;
        print @atom_records;
    }
}

sub fetchEntry {
    my $dbName = shift;
    my $entryId = shift;
    # Create an LWP UserAgent for making HTTP calls.
    my $ua = LWP::UserAgent->new();
    # Configure HTTP proxy support from environment.
    $ua->env_proxy;

    my $requestUrl = $dbfetchBaseUrl . '/' . $dbName . '/' . $entryId . '?style=raw';
    print STDERR $requestUrl, "\n";
    my $response = $ua->get($requestUrl);
    if($response->is_error) {
        die 'http status: ' . $response->code . ' ' . $response->message;
    }
    my $content = $response->content();
    if($content =~ m/^ERROR/) {
        warn $content;
    }
    return $content;
}

score 0 · Answer 4 · 2019-02-11

0

Entering edit mode

5.8 years ago

M.O.L.S ▴ 100

First, Specify which chain you want to look at.

Different PDB files may have different numbers of chains.

Use the chain methods to get the atoms and atomic coordinates.

If you are looking at multiple PDB files you have to find a way to wrap that together in a for loop.

ADD COMMENT • link 5.4 years ago by M.O.L.S ▴ 100