Hi, I need help. I don't have much coding experience. I am trying to replace selenocysteine (U) in a PHYLIP file with X. How can I do that? I tried BioPerl but have not succeeded yet. My attempt is below.
use strict;
use warnings;
use Getopt::Std;
use Bio::AlignIO;

my %opts = ();
getopts('f:', \%opts);
my $file = $opts{'f'} or die "Usage: $0 -f alignment.phy\n";
# Read from the input file; write to STDOUT so the input is never overwritten.
my $alnin  = Bio::AlignIO->new(-format => 'phylip', -file => $file);
my $alnout = Bio::AlignIO->new(-format => 'phylip', -fh => \*STDOUT);
while (my $aln = $alnin->next_aln()) {
    # Edit the residues of each sequence object; the sequence IDs are left untouched.
    foreach my $seq ($aln->each_seq()) {
        (my $residues = $seq->seq()) =~ tr/UO/XX/;
        $seq->seq($residues);
    }
    $alnout->write_aln($aln);
}
If it's a simple replacement, you can try
sed 's/[UO]/X/g' test.txt
But this will also replace any U or O in a sequence header, e.g. a sequence named AOUxyz. If you can modify it to skip the names, that would be great.
I agree with that, @pb. However, I am waiting for example input from the OP. Most posts from newcomers do not include data, expected output, or the error message. If I can get hold of an example file with selenocysteine in PHYLIP format, I can work it out.
Thanks everyone for the help! Sorry for the bad formatting and for not posting an example file. The sed command is quick and easy to use, but the problem is that the names sometimes contain U or O. That happens once in the current file, but other files would have more cases. So I will put a small part of the file here so we can discuss it further:
Those can be handled with awk and sed, @OP.
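Since no example file made it into the thread, here is only a rough sketch of the awk approach. It assumes a sequential, relaxed PHYLIP layout: the first line is the "ntax nchar" header, and every other line starts with the sequence name as its first whitespace-delimited field, followed by the residues. It will not handle interleaved files, where continuation blocks have no names. The function name mask_sec_pyl is made up for this example.

```shell
# Hypothetical helper: mask U/O in the residues only, leaving names untouched.
# Assumes sequential relaxed PHYLIP (name = first field on each line).
mask_sec_pyl() {
    awk '
        NR == 1 { print; next }            # header line: "ntax nchar"
        match($0, /^[ \t]*[^ \t]+/) {      # leading blanks + first field = name
            name = substr($0, 1, RLENGTH)
            rest = substr($0, RLENGTH + 1)
            gsub(/[UO]/, "X", rest)        # replace only after the name
            print name rest
            next
        }
        { print }                          # blank lines pass through unchanged
    ' "$@"
}

# Small made-up input to try it on; on a real file: mask_sec_pyl in.phy > out.phy
printf ' 2 6\nAOUxyz    MKUAOL\nseq2      MKCAKL\n' | mask_sec_pyl
```

On the made-up input, the name AOUxyz stays as it is while its residues MKUAOL become MKXAXL. If your files use strict PHYLIP with fixed 10-character names (which may contain spaces), splitting at column 10 with substr would be the safer choice.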