perl fix code (Help :) )
5.6 years ago

I have a big list of proteins in a format like the one below:

ID   140U_DROME              Reviewed;         261 AA.
AC   P81928; Q9VFM8;
FT   CHAIN         1    261       RPII140-upstream gene protein.
FT                                /FTId=PRO_0000064352.
FT   TRANSMEM     67     87       Potential.
FT   TRANSMEM    131    151       Potential.
FT   TRANSMEM    183    203       Potential.
FT   CONFLICT     64     64       S -> F (in Ref. 1).
SQ   SEQUENCE   261 AA;  29182 MW;  5DB78CF6CFC4435A CRC64;
     MNFLWKGRRF LIAGILPTFE GAADEIVDKE NKTYKAFLAS KPPEETGLER LKQMFTIDEF
     GSISSELNSV YQAGFLGFLI GAIYGGVTQS RVAYMNFMEN NQATAFKSHF DAKKKLQDQF
     TVNFAKGGFK WGWRVGLFTT SYFGIITCMS VYRGKSSIYE YLAAGSITGS LYKVSLGLRG
     MAAGGIIGGF LGGVAGVTSL LLMKASGTSM EEVRYWQYKW RLDRDENIQQ AFKKLTEDEN
     PELFKAHDEK TSEHVSLDTI K

and the format I want to get is:

 >ADBR2_HUMAN|P07550|413aa
MGQPGNGSAFLLAPNGSHAPDHDVTQERDEVWVVGMGIVMSLIVLAIVFGNVLVITAIAKFERLQTVTN
----------------------------------MMMMMMMMMMMMMMMMMMMMMMMM-----------
//

Can someone fix my code, please?

#!/usr/bin/perl 
use strict;

use warnings;  

open(IN, "<transmem_proteins.swiss") or die "i can not open the transmem_proteins.swiss, $!";
open (OUT, ">askisi1.txt");


while (<IN>)
{   

if ($_=~m/^ID\s{3} (\w+_\w+)\s / )
    $id=$1;
    print OUT "$id ID\n";


if ($_=~m/^AC\s{3} (\w+)\w{1} and $ac==0)
    $ac=$$1;
    $ac++;

if ($_=~m/^SQ\s{3} SEQ\s{3}(\w+)\s{1}AA)
    $aa = $1;
    print OUT "$aa aa\n";

if ($_=~m/^\s{5}(.*)\n/)
{
    seq = $seq.$1;
    }
if ($_=~m/^FT\s{3}TRANSMEM\s+(\w+)\s+(\w+)/)
    if($start[0] == NULL){
    shift (@start);
    }
    if ($end[0] == NULL){

    shift (@end);
    }
    push @start, $s1;
    push @end, $s2;
    $number_of_transmem++;

    $seq = ~s/\s//g;
    if ($_=~m/\/\//;

if ($tr_table[0]=NULL){
shift (@tr_table);
}   
for ($a=0; $a<$aa; $a++){
$tr_table[$a]='-';
}
for ($tr=0; $tr<$number_of_transmem; $tr++)
    for ($a=0; $a,aa; $a++)
    if (($a+1)>= $start[$tr] and ($a+1)<=$end[$tr]){
    $tr_table[$a]='M';
}
print OUT @tr_table;
print OUT "\n";
print OUT "//\n";

$ac=0;
$number_of_transmem=0;
$seq=~s/.*//;
@start=NULL;
@end=NULL;
@tr_table=NULL;
}
coding perl • 1.1k views
5.6 years ago
AK ★ 2.2k

Hi Elza Chotza,

Although 93% of Paint Splatters are Valid Perl Programs, your code still has too many typos, plus other problems such as missing braces around the if blocks and a missing closing slash in a regular expression. Also, $seq = ~s/\s//g should be written as $seq =~ s/\s//g (no space between = and ~). A quick check with PerlTidy returned:

PerlTidy: Errors reported by perltidy during last run
=====================================================

14: $id=$1;
    ^
found Scalar where operator expected

Missing ';' above?

30: if ($_=~m/^FT\s{3}TRANSMEM\s+(\w+)\s+(\w+)/)
                               ----------^
found ( where operator expected (previous token underlined)

11: {
    ^
43: if ($_=~m/\/\//;
       ^
66: }
    ^
Found 1 extra '(' between '{' on line 11 and '}' on line 66
    The most recent un-matched '(' is on line 43
67: final indentation level: 1

Final nesting depth of '('s is 1
The most recent un-matched '(' is on line 43
43: if ($_=~m/\/\//;
       ^
67: To save a full .LOG file rerun with -g
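
On the =~ point mentioned above: with a space in between, "= ~" is no longer the binding operator but an assignment followed by the bitwise-negation operator ~, so the substitution runs on $_ rather than on $seq. Here is a minimal sketch of the difference; the sample strings and the names $clean and $wrong are made up for illustration only:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, only for illustration.
my $seq = "MNFLWKGRRF LIAGILPTFE";

# Correct: the binding operator =~ applies the substitution to the copy of $seq.
( my $clean = $seq ) =~ s/\s//g;
print "$clean\n";    # MNFLWKGRRFLIAGILPTFE

# Wrong: "= ~" is an assignment plus bitwise negation (~).
# The substitution runs on $_, and $wrong receives the negated
# count of replacements (a large integer), not a cleaned sequence.
local $_ = "A B C";
my $wrong = ~s/\s//g;
print "$_\n";        # ABC  ($_ was modified instead)
print "\$wrong is a number, not a sequence\n" if $wrong =~ /^\d+$/;
```

Note that this mistake compiles without any error, which is why it is easy to miss.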

After fixing the typos and making use of $transmem_str = "-" x $aa; to create the transmembrane string, followed by substr( $transmem_str, $start, $length, "M" x $length ); to replace - with M according to the start and end locations, your code can be modified to the following (tested with the example from https://www.uniprot.org/uniprot/P81928.txt):

#!/usr/bin/perl
use strict;
use warnings;

open( IN, "<P81928.txt" ) or die "Cannot open P81928.txt: $!";
open( OUT, ">askisi1.txt" ) or die "Cannot open askisi1.txt: $!";

my ( $id, $ac, $aa, $seq, $transmem_str ) = ( "", "", "", "", "" );
my @tr_table = ();
while (<IN>) {
    if ( $_ =~ m/^ID\s{3}(.+?)\s+/ ) {
        $id = $1;
    }

    if ( $_ =~ m/^AC\s{3}(.+?);/ ) {
        $ac = $1;
    }

    if ( $_ =~ m/^FT\s{3}TRANSMEM\s+(\d+)\s+(\d+)\s+/ ) {
        my ( $start, $end ) = ( $1 - 1, $2 - 1 );
        my $length = $end - $start + 1;
        push @tr_table, [ $start, $length ];
    }

    if ( $_ =~ m/^SQ\s{3}SEQUENCE\s{3}(\d+)\sAA;/ ) {
        $aa           = $1;
        $transmem_str = "-" x $aa;
    }

    if ( $_ =~ m/^\s{5}(.*)\n/ ) {
        $seq = $seq . $1;
        $seq =~ s/\s//g;
    }

    if ( $_ =~ /^\/\// ) {
        foreach my $tr (@tr_table) {
            my ( $start, $length ) = ( $tr->[0], $tr->[1] );
            substr( $transmem_str, $start, $length, "M" x $length );
        }

        print OUT ">$id\|$ac\|$aa" . "aa\n";
        print OUT "$seq\n";
        print OUT "$transmem_str\n";
        print OUT "//\n";

        ( $id, $ac, $aa, $seq, $transmem_str ) = ( "", "", "", "", "" );
        @tr_table = ();
    }
}

It returns:

$ cat askisi1.txt
>140U_DROME|P81928|261aa
MNFLWKGRRFLIAGILPTFEGAADEIVDKENKTYKAFLASKPPEETGLERLKQMFTIDEFGSISSELNSVYQAGFLGFLIGAIYGGVTQSRVAYMNFMENNQATAFKSHFDAKKKLQDQFTVNFAKGGFKWGWRVGLFTTSYFGIITCMSVYRGKSSIYEYLAAGSITGSLYKVSLGLRGMAAGGIIGGFLGGVAGVTSLLLMKASGTSMEEVRYWQYKWRLDRDENIQQAFKKLTEDENPELFKAHDEKTSEHVSLDTIK
------------------------------------------------------------------MMMMMMMMMMMMMMMMMMMMM-------------------------------------------MMMMMMMMMMMMMMMMMMMMM-------------------------------MMMMMMMMMMMMMMMMMMMMM----------------------------------------------------------
//
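
The masking step used above can also be tried in isolation. The following is a small sketch, not part of the tested script: the helper name transmem_mask and its argument layout are assumptions chosen for illustration. It takes the protein length and a list of [start, end] pairs (1-based and inclusive, as on the FT TRANSMEM lines) and uses the 4-argument form of substr to overwrite each region with M:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a '-'/'M' mask for a protein of $aa residues.
# @regions holds [start, end] pairs, 1-based and inclusive.
sub transmem_mask {
    my ( $aa, @regions ) = @_;
    my $mask = "-" x $aa;    # start with all dashes
    for my $r (@regions) {
        my ( $start, $end ) = @$r;
        my $length = $end - $start + 1;

        # 4-argument substr replaces $length characters,
        # starting at the 0-based offset $start - 1.
        substr( $mask, $start - 1, $length, "M" x $length );
    }
    return $mask;
}

print transmem_mask( 10, [ 3, 5 ], [ 8, 9 ] ), "\n";    # --MMM--MM-
```

For the P81928 example, calling transmem_mask( 261, [ 67, 87 ], [ 131, 151 ], [ 183, 203 ] ) should give a 261-character string with three runs of 21 M characters, matching the output shown above.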

Thank you so much! You saved my day :D. I am trying to learn Perl, but it's all Greek to me.
