Hi, I need help. I don't have much coding experience. I am trying to replace selenocysteine (U) in a PHYLIP file with X. How can I do that? I tried to use Bioperl but still not successful. My attempt is below.
#!/usr/bin/perl
use Getopt::Std;
use Bio::Seq;
use Bio::SeqIO;
use Bio::AlignIO;
use Bio::SimpleAlign;

my %opts = ();
getopts('f:', \%opts);
my $file = $opts{'f'};
my $alnin  = Bio::AlignIO->new(-format => 'phylip', -file => "$file");
my $alnout = Bio::AlignIO->new(-format => 'phylip', -file => "$file");
while (my $aln = $alnin->next_aln()) {
    my $id  = $aln->id_linebreak();
    my $seq = $aln->seq();
    $seq =~ s/U/X/g;
    $seq =~ s/O/X/g;
    print "$id\n$seq\n";
}
If it's a simple replacement, you can try
sed 's/[UO]/X/g' test.txt
But this will also replace any U or O in the sequence names, e.g. a sequence named AOUxyz. So if you can modify it, that would be great.
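To leave the names alone, one option is to let awk split each line into a name part and a sequence part and substitute only in the latter. This is a minimal sketch, not a tested solution for every PHYLIP variant: it assumes a relaxed, non-interleaved file where the first line holds the counts and every following line is a name, some whitespace, then the sequence. `mask_phylip` is a made-up helper name, and the sample input is invented for illustration.

```shell
# mask_phylip: hypothetical helper; replaces U and O with X in the sequence
# part only. Assumes relaxed, non-interleaved PHYLIP ("name<whitespace>sequence").
mask_phylip() {
    awk '
        NR == 1 { print; next }                  # header line with the counts: keep as is
        match($0, /^[ \t]*[^ \t]+[ \t]+/) {      # name plus the whitespace after it
            prefix = substr($0, 1, RLENGTH)      # the name field, left untouched
            rest   = substr($0, RLENGTH + 1)     # everything after the name
            gsub(/[UO]/, "X", rest)              # mask only in the sequence
            print prefix rest
            next
        }
        { print }                                # blank or name-only lines: keep as is
    ' "$@"
}

# Invented sample: the name AOUxyz should survive, the residues should not.
printf ' 2 8\nAOUxyz    MKUOLAUU\nseq2      MKCCLAGG\n' | mask_phylip
```

With a real file, `mask_phylip test.txt > masked.phy` would write the result to a new file (`masked.phy` is a placeholder name). Interleaved files, where later blocks carry no names, would need a different rule.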
I agree with that, @pb. However, I am waiting for example input from the OP. Most posts from newcomers do not include data, expected output, or the error. If I can get hold of an example file with selenocysteine in PHYLIP format, I can work it out.
Hello ahmedmagds,
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
Hello ahmedmagds,
could you please post an example of your input and desired output? What do you mean by "I tried to use Bioperl but still not successful"? And is Perl mandatory, or are other solutions fine as well?
fin swimmer
Thanks, everyone, for the help! Sorry for the bad formatting and for not posting an example file. The sed command is quick and easy to use, but the problem is that the names sometimes contain U/O. That happens once in the current file, but other files would have more. So I will put a small part of the file here so we can discuss further:
@OP, those cases can be handled with awk and sed.
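As a rough illustration of the awk route, here is a sketch for the strict layout, where the name occupies the first 10 columns and may itself contain spaces. It assumes a sequential file with one sequence per line after the header; `mask_strict_phylip` is a made-up name and the sample lines are invented.

```shell
# mask_strict_phylip: hypothetical helper; assumes strict, sequential PHYLIP
# where columns 1-10 of every line after the header hold the name.
mask_strict_phylip() {
    awk '
        NR == 1 { print; next }        # header line with the counts: keep as is
        {
            name = substr($0, 1, 10)   # fixed-width name field, left untouched
            seq  = substr($0, 11)      # sequence starts at column 11
            gsub(/[UO]/, "X", seq)     # mask U and O only in the sequence
            print name seq
        }
    ' "$@"
}

# Invented sample: "Homo sapU " is a 10-column name containing a space and a U.
printf ' 2 8\nHomo sapU MKUOLAUU\nseq2      MKCCLAGG\n' | mask_strict_phylip
```

Which of the two layouts applies depends on how the file was written, so it is worth checking a few lines of the real data before picking one.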