Dear BioPerl professionals, I am new to BioPerl and have managed to write this code at a beginner's level. I want to add annotation values to my GenBank file, but the annotation is in a tab-delimited text file. For example, if the protein_id in the GenBank file matches a line of annotation in the text file, the script should append that value to the /notes of the CDS. Please help me. Many thanks.
#!/usr/bin/perl
use Bio::Seq;
use Bio::SeqIO;
use Bio::SeqFeature::Generic;
use IO::String;
use Bio::Perl;
open(INFILE, "testdata"); # this file is in tabular format its my annotation file
@annovalue= $line = <INFILE>; # I tried opening the file in bioperl couldn't so I used perl
$myfile = 'testgenbank.gbk'; # this is my genbank file I want to add annotation value to
my $seqio_obj = Bio::SeqIO ->new (-file => $myfile,-format=>'genbank');
while($annovalue ne "" )
{
for $feat_object ($seq_obj->get_SeqFeatures)
{
if ($feat_object->primary_tag eq "CDS")
{
if ($feat_object->has_tag('protein_id'))
{
for my $val ($feat_object->get_tag_values('protein_id'))
{
foreach $val(@feat_object)
{
if ($val=~ /$annovalue/)
{ # here am saying if the protein id in annotation file matches the tabular file
$feat_object -> $set_tag_values('notes'); # add the annotation to the notes in the CDS of that gene
$seq_obj = $seqio_obj -> $next_seq();
}
}
}
}
}
}
print OUTFILE ($line);
}
I tried to format your code for readability, but it looks like some brackets are missing.
Where, please?
In your main loop, you open 7 brackets but close only 4.
I have added the brackets, yet I am still certain something is wrong with my syntax. It is still not working.
In addition to the original missing brackets, the indentation is not consistent. Sometimes you indent the line after an opening bracket, but sometimes you don't. These may seem like small things but they make a huge difference in someone's ability to understand what the code is doing.
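To illustrate what a consistent layout might look like, here is a minimal sketch of the lookup described in the question. It is not a drop-in fix: it assumes the annotation file has two tab-separated columns (protein ID, then annotation text), and the names load_annotations, annotate_genbank and add_notes.pl are made up for this example. It also uses the tag 'note', since the standard GenBank qualifier is /note rather than /notes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a two-column, tab-separated table (protein ID, annotation text)
# into a hash keyed by protein ID. The column layout is an assumption.
sub load_annotations {
    my ($fh) = @_;
    my %note_for;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;               # skip blank lines
        my ($protein_id, $text) = split /\t/, $line, 2;
        next unless defined $text;              # skip lines with no second column
        $note_for{$protein_id} = $text;
    }
    return \%note_for;
}

# Add a /note to every CDS whose /protein_id appears in the table,
# then write each record to a new GenBank file.
sub annotate_genbank {
    my ($in_file, $out_file, $note_for) = @_;
    require Bio::SeqIO;    # loaded here so the table parser works without BioPerl

    my $in  = Bio::SeqIO->new(-file => $in_file,     -format => 'genbank');
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'genbank');

    while (my $seq = $in->next_seq) {
        for my $feat ($seq->get_SeqFeatures) {
            next unless $feat->primary_tag eq 'CDS';
            next unless $feat->has_tag('protein_id');
            for my $id ($feat->get_tag_values('protein_id')) {
                $feat->add_tag_value('note', $note_for->{$id})
                    if exists $note_for->{$id};
            }
        }
        $out->write_seq($seq);
    }
    return;
}

# Hypothetical usage: perl add_notes.pl annotations.tsv in.gbk out.gbk
if (@ARGV == 3) {
    my ($table, $gbk_in, $gbk_out) = @ARGV;
    open my $fh, '<', $table or die "Cannot open $table: $!";
    my $note_for = load_annotations($fh);
    close $fh;
    annotate_genbank($gbk_in, $gbk_out, $note_for);
}
```

Loading the table into a hash first means each CDS needs only one lookup, instead of re-reading the annotation file for every feature. Exact string matching is used here, so a protein_id with a version suffix will only match if the table contains the same suffix.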
Hi Daniel, I have rewritten the code. I want to print the ID of each record in the FASTA file, which in my case is the protein ID, and match it to the protein_id tag value of the CDS feature. If it matches, I want to add the FASTA sequence (which in my case is the protein function) to the /notes tag.