Bioperl gff3 : How to print non gff3 feature as "###" using Bio::Tools::GFF
2
0
Entering edit mode
8.4 years ago
Juke34 8.9k

Hi everybody,

It is not the first time I meet the issue. GFF3 format allows to have lines that are not features like ###. But using Bio::Tools::GFF I don't find any method allowing to print that kind of non-feature lines.

Does someone know a way to perform that task ? I'm only interested in Perl solution.

Thank you !

gene genome • 2.8k views
ADD COMMENT
0
Entering edit mode

Could you write an example of what you'd like to see in the GFF3 file?

ADD REPLY
0
Entering edit mode

Something like:

ctg123 . gene               1000  9000  .  +  .  ID=gene00001;Name=EDEN
ctg123 . mRNA               1050  9000  .  +  .  ID=mRNA00001;Parent=gene00001;Name=EDEN.1
ctg123 . CDS                1201  1500  .  +  0  ID=cds00001;Parent=mRNA00001
ctg123 . CDS                3000  3902  .  +  0  ID=cds00001;Parent=mRNA00001
ctg123 . CDS                5000  5500  .  +  0  ID=cds00001;Parent=mRNA00001
ctg123 . CDS                7000  7600  .  +  0  ID=cds00001;Parent=mRNA00001
###
ctg123 . gene               11000  91000  .  +  .  ID=gene00002
ctg123 . mRNA               11050  91000  .  +  .  ID=mRNA00002;Parent=gene00002
ctg123 . CDS                11201  11500  .  +  0  ID=cds00002;Parent=mRNA00002
ctg123 . CDS                31000  31902  .  +  0  ID=cds00002;Parent=mRNA00002
ADD REPLY
0
Entering edit mode

It seems Bio::Tools::GFF can only output features. The solution is to output the non-feature lines yourself. If you want to use forward reference closing (###), you could output relevant features with Bio::Tools::GFF then output ### then output the next batch of features and so on. If that's not convenient, you could output all features then insert the non-feature lines. You would need to keep track of where they should go in the file. Also you could do this in memory before actually printing everything to file.

ADD REPLY
0
Entering edit mode

I wanted to print directly in a file, but to put everything in memory before to print it seems to be the only convenient solution. I'm a bit worry because my files are over 2gb ... so I hope my computer can digest it easily.

ADD REPLY
2
Entering edit mode
8.4 years ago
Lee Katz ★ 3.2k

Here is the code that actually works after I did some digging. It's a little bit of a hack but I don't believe it will cause any problems. Also I changed it to Bio::FeatureIO which I think is a little more mainstream. In this one-liner, ### is printed to the output file after each input feature, but you can change it to however you want.

perl -MBio::FeatureIO -e '
  $in=Bio::FeatureIO->new(-file=>"in.gff"); 
  $out=Bio::FeatureIO->new(-file=>">tmp.gff"); 
  $fh=$out->{"_filehandle"}; 
  while($feat=$in->next_feature){
    $out->write_feature($feat); 
    print $fh "###\n";
  }
'

The method fh() actually performs some tie and Symbol::gensym magic which I found in the source code. This magic forces you to use a feature object when you write to the file.

sub fh {
  my $self = shift;
  my $class = ref($self) || $self;
  my $s = Symbol::gensym;
  tie $$s,$class,$self;
  return $s;
}

To get around that, I found that the filehandle is actually embedded in the object's variables, under _filehandle. Therefore you can exploit $out->{"_filehandle"} by printing to it directly.

ADD COMMENT
0
Entering edit mode

It's exactly what I was looking for !! Thank you so much !

ADD REPLY
0
Entering edit mode
8.4 years ago
Lee Katz ★ 3.2k

Jean-Karim is correct that the GFF module will probably only output features. However, you probably can manually write to the file in your perl script! There is a method called $obj->fh which returns the filehandle of your GFF object. Therefore you could write to the file as desired. For example:

$gffObj->write_feature($something);
# Get the filehandle
my $fh=$gffObj->fh;
# Print comments to the file
print $fh "###\n";
$gffObj->write_feature($something2);
$gffObj->close
ADD COMMENT
0
Entering edit mode

It would be nice, but it doesn't work. It as well done to print a feature object. The error is : Can't locate object method "strand" via package "### ...

I'm afraid that the only solution will be to work on bioperl to fix it.

ADD REPLY

Login before adding your answer.

Traffic: 2279 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6