How can I use BioPerl to parse a UniProtKB flat file in order to get the function annotation of a protein?
At the moment I can only fetch all of the comments of a protein. Can someone help me? I only want to fetch the "FUNCTION" comment.
My code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Bio::SeqIO;
    use Bio::DB::SwissProt;

    # Read one accession per line from the file given as the first argument.
    open(my $gene_fh, '<', $ARGV[0]) or die "cannot open gene file: $!";
    my @genes;
    while (<$gene_fh>) {
        chomp;
        push @genes, $_;
    }
    close $gene_fh;

    my $db_obj = Bio::DB::SwissProt->new;
    # Pass an array reference; ["@genes"] would join all accessions into one string.
    my $stream_seq = $db_obj->get_Stream_by_acc(\@genes);
    my $i = 0;
    while (my $seq_obj = $stream_seq->next_seq) {
        my $anno_collection = $seq_obj->annotation;
        for my $key ($anno_collection->get_all_annotation_keys) {
            my @annotations = $anno_collection->get_Annotations($key);
            for my $value (@annotations) {
                # This prints the whole comment block, not just FUNCTION.
                if ($value->tagname eq "comment") {
                    print "$genes[$i]:", $value->display_text, "\n";
                }
            }
        }
        $i++;
    }
Result:

    Q8N349:-!- FUNCTION: Odorant receptor (Potential).
    -!- SUBCELLULAR LOCATION: Cell membrane; Multi-pass membrane protein.
    -!- SIMILARITY: Belongs to the G-protein coupled receptor 1 family.
    -!- WEB RESOURCE: Name=Human Olfactory Receptor Data Exploratorium
    (HORDE);
    URL="http://bip.weizmann.ac.il/cgi-bin/HORDE/showGene.pl?key=symbol&value=OR2L13";
    -----------------------------------------------------------------------
    Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
    Distributed under the Creative Commons Attribution-NoDerivs License
    -----------------------------------------------------------------------

Please take a few minutes to format your question correctly.
Yes, please do. I made a start for you. Indent lines of code with 4 spaces, and do not copy/paste tabs.
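
Since the comment text in the result above marks each topic with `-!-`, one possible approach is to split the text on that marker and keep only the topic that starts with `FUNCTION:`. The following is a minimal sketch, not a definitive solution. `extract_function` is a hypothetical helper (it is not part of BioPerl), and it assumes the comment text looks like the output shown in the question, including the trailing copyright banner.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): return the text of every
# "FUNCTION:" topic found in a UniProtKB comment string.
sub extract_function {
    my ($comment_text) = @_;
    my @functions;
    # Topics are introduced by "-!-", so split on that marker.
    for my $topic (split /-!-\s*/, $comment_text) {
        # Keep only topics that begin with "FUNCTION:".
        next unless $topic =~ /^FUNCTION:\s*(.*)/s;
        my $text = $1;
        $text =~ s/-{5,}.*//s;   # drop a trailing copyright banner, if present
        $text =~ s/\s+/ /g;      # collapse line wraps into single spaces
        $text =~ s/\s+$//;       # trim trailing whitespace
        push @functions, $text;
    }
    return @functions;
}

# Sample text copied from the result in the question.
my $comment = join "\n",
    '-!- FUNCTION: Odorant receptor (Potential).',
    '-!- SUBCELLULAR LOCATION: Cell membrane; Multi-pass membrane protein.',
    '-!- SIMILARITY: Belongs to the G-protein coupled receptor 1 family.';

print "$_\n" for extract_function($comment);
```

Inside the loop from the question, the `print` line could then become something like `print "$genes[$i]: $_\n" for extract_function($value->display_text);`. The helper also accepts a string that holds a single topic without the `-!-` marker, in case a BioPerl version returns each topic as its own comment.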