Dear ALL,
I am trying to replace atypical aminoacid (one letter symbol like U), that is either leucine or isoleucine.
I have a long bacterial protein fasta-file with the single line headers and single line seqs.
I need to produce the modified fasta-file with the replaced atypical aminoacids and an additional file saying where they where situated before the substitution. I have a very strange result - probably I made a mistake in curly brace positions.
I am looking at this file for too long, I've not seen any obvious mistake.
Please, help me to find it! Thank you very much!
Sincerely yours,
Natasha
#!/bin/env perl
use strict;
use warnings;
my $file_name1;
my $file_name2;
my $file_name3;
my $file_name4;
my $line;
my @seq_line;
my @string;
$file_name1 = $ARGV[0];
$file_name2 = $ARGV[1];
$file_name3 = $ARGV[2];
$file_name4 = $ARGV[3];
open(IN1, "<".$file_name2) or die; # fasta-file with lines as headers and protein seqs
open(OUT1, ">".$file_name3) or die; # my file headers and seqpositions of the wrong aa symb
open(OUT2, ">".$file_name4) or die; #modified fasta –file - programs do not like atypical aas.
while(my $line=<IN1>) {
next if $line =~ /^\s$/; # skip empty line
# next if $line =~ /^>/;
# I have to print this line to my output file
if($line =~ /^>/) {
print OUT1 $line;
# print OUT2 $line."\n";
# next;
}
# split the whole seq into the aminoacids. No new-line symbol
my @seq_line = split(//, $line);
for(my $i=0; $i< scalar(@seq_line); $i++) {
if ($seq_line[$i] eq "U") {
$line=~s/U/L/;
print OUT1 "position".$i." .atypical.protsymbol"."_".$file_name3."\n";
}
elsif ($seq_line[$i] eq "u"){
$line=~s/u/l/;
print OUT1 "position ".$i." .atypical.protsymbol"."_".$file_name3."\n";
}
}
print OUT2 $line."\n";
}
next;
}
close(IN1);
close(OUT1);
close(OUT2);
Thank you very much, I will try. The position is the main point - I won't be able to find the particular L without it. And it will give me a respective header, am I right?