Entering edit mode
8.1 years ago
Stephane Plaisance
▴
460
I used a simplistic script to convert GB to EMBL format and manual inspection reveals strange behaviour
#!/usr/bin/perl -w
use strict;
use Bio::SeqIO;
if (@ARGV != 2) { die "USAGE: gb2embl.pl <GB_file> <EMBL_name>\n"; }
my $seqio = Bio::SeqIO->new('-format' => 'genbank', '-file' => "$ARGV[0]");
my $seqout = new Bio::SeqIO('-format' => 'embl', '-file' => ">$ARGV[1]");
while( my $seq = $seqio->next_seq) {
$seqout->write_seq($seq)
}
a piece of input GB formatted
mRNA complement(join(280..5840,5989..6007))
/gene="YRF1-7"
/locus_tag="YPL283C"
/gene_synonym="YRF1"
/product="Y' element ATP-dependent helicase protein 1 copy
7"
the same piece in the embl output
FT mRNA join(complement(280..5840),complement(5989..6007))
FT /locus_tag="YPL283C"
FT /gene_synonym="YRF1"
FT /gene="YRF1-7"
FT /product="Y' element ATP-dependent helicase protein 1 copy
FT 7"
I do not think that this is OK
complement( join ( -1-> -2-> ) ) = <-2- <-1-
join( complement( -1-> ) complement ( -2-> ) ) = <-1- <-2-
the order gets inverted by this logic, leading to chaotic sequence fusion
Anybody agrees and/or has comments?
Bioperl used: 1.007001
you have a point, I did not consider this. thanks for your clarification!
Hi Stephane,
Please use
ADD COMMENT
to reply to earlier answers, as such this thread remains logically structured and easy to follow. Thanks!Cheers,
Wouter