9.5 years ago
anp375
Hi,
I'm using this module: http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Ontology/GOterm.html
for a project, and I'm supposed to create a new GOterm object and pass in a GO ID. I'm trying to print the description of the GO term with to_string, but every single description is empty, and under "obsolete" a 0 is printed. How do I fix this? The GO terms are in the right format.
Thank you
#!/usr/bin/perl -w
use Bio::Ontology::GOterm;
use strict;

# Read in ccds file for parsing
my $in = "";

# Use subroutines parse file and send gene
# names to Bio::GO module to get the go_id,
# definition
GeneList('1'); # For chromosome 1. If this was a method, $in would be passed in. I'm passing in a specification for chromosome 1.

############### Subs #########################
sub GeneList { # Pass in the chromosomes you want to check.
    my @chromosomes = @_;
    my %genelist; # This gives a nonredundant gene list. The keys are the gene IDs or names and the
                  # values are references to the arrays of other information.
                  # Only the last gene entry for a certain gene will be stored. Length of that sequence
                  # may differ from the other gene entries though.
    open(IN, $in) || die "Error: $!\n";
    my $out = ">";
    open(OUT, $out);
    my @IN2 = split(/\r/, <IN>); # <IN> sends back a giant string because lines are delimited by \r instead of \n
    foreach (@IN2) {
        my @fields = split(/\t/);
        # Skip "withdrawn" in ccds_status field, check if the first line is skipped
        if ($fields[5] ne 'Public') {next};
        if (!grep(/^$fields[0]$/, @chromosomes)) {next};
        # Remove redundancy in the list. If gene name exist more than once,
        # then we are to remove the extra names.
        $genelist{$fields[3]} = \@fields; # The above lines already avoid the withdrawn sequences and
                                          # sequences not on chromosome 1.
    }
    foreach my $gene (keys %genelist) { # after redundancy is removed
        foreach my $arr (@{GOTerms($gene, $genelist{$gene})}) { # print out each item in the returned list
            if ($arr) {
                print OUT $arr;
            }
        }
        print OUT "\n"; # then make sure there is a nextline
    }
    close(OUT);
    my $ref = \%genelist;
    return $ref;
}

sub GOTerms { # This will take one ID at a time, along with an array reference.
    my $ID = $_[0];
    my @array = @{$_[1]};
    if (length($ID) > 7) {$ID = '0000000';} # GO numbers maximum 7 digits
    my $zeros = '0' x (7 - length($ID)); # to make the ID 7 characters long
    # create a go terms object
    my $go_term = Bio::Ontology::GOterm->new(-go_id => "GO:$zeros$ID"); # Can -go_id => "GO:$ID" work? Or does it need all arguments?
    # implement the methods Go_id and to_string of this object to
    # return a list that is formatted like so:
    # chr#\tgene_name\tGO_id\tGOTerm
    # chr1 stat1 GO:0003947 The term description
    #$go_term->GO_id("GO:$zeros$ID"); # This should take in "GO_REF:$zeros$ID"
    print $go_term->GO_id()."\n";
    my @gene = ("chr$array[0]", "\t", $array[2], "\t", $go_term->GO_id(), "\t", $go_term->to_string());
    return \@gene;
}
It doesn't let me print out $arr[0]. I return a reference to an array, treat it as a reference to an array, and print it to the file. A sample output is:

So it prints out some things, but it doesn't print out any actual information. Under "Is obsolete" it prints a 0, and it does this for each GO term.
Sorry, yes, your code is correct. In fact everything is working as it should. You create a new GO term object and set only its id attribute, so when you print it every other field is empty. I guess what you want is to retrieve the information linked to this GO id. For that you are not using the right method; you have to check whether a method that does this exists.
Look at: http://search.cpan.org/dist/BioPerl/Bio/Ontology/OBOEngine.pm
You will need the gene ontology definition file (http://www.geneontology.org/ontology/gene_ontology.obo)
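As a rough illustration of that approach, here is a minimal sketch that loads the OBO file through Bio::OntologyIO and looks a term up by its identifier. It assumes the file has been downloaded locally; the helper names load_ontologies and find_go_term and the script name go_lookup.pl are made up for this example, and the loop over several ontologies is only a precaution in case the parser returns more than one. I have not run it against the current GO release.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::OntologyIO;

# Hypothetical helper: collect every ontology the parser yields from an OBO file.
sub load_ontologies {
    my ($obo_file) = @_;
    my $parser = Bio::OntologyIO->new(-format => 'obo', -file => $obo_file);
    my @ontologies;
    while (my $ont = $parser->next_ontology()) {
        push @ontologies, $ont;
    }
    return @ontologies;
}

# Hypothetical helper: return the first term matching $go_id, or undef if none.
sub find_go_term {
    my ($go_id, @ontologies) = @_;
    for my $ont (@ontologies) {
        my ($term) = $ont->find_terms(-identifier => $go_id);
        return $term if $term;
    }
    return;
}

# Usage (assumed file name): perl go_lookup.pl gene_ontology.obo GO:0003947
if (@ARGV == 2) {
    my ($obo_file, $go_id) = @ARGV;
    my @ontologies = load_ontologies($obo_file);
    my $term       = find_go_term($go_id, @ontologies);
    if ($term) {
        # name and definition come from the parsed file, not from the constructor
        printf "%s\t%s\t%s\n",
            $term->identifier,
            $term->name       // '',
            $term->definition // '';
    }
    else {
        print "$go_id not found in $obo_file\n";
    }
}
```

The terms returned this way are populated from the file, so name and definition should no longer be empty. Parsing the full ontology takes a while, so load it once and reuse it for every gene rather than once per lookup.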
Thank you! I will try this. I assumed the descriptions were pulled off the internet by one of the other modules that GOterm uses.