How Do I Convert 454 Ace To A Regular Ace?
14.0 years ago
Lee Katz ★ 3.2k

I have read on the BioPerl site that a 454 ACE file is nonstandard because of its coordinate system. How can I convert it to a standard ACE file?

When I run this code, iterating over either contig or assembly objects, I get the error shown after the code.

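use Bio::Assembly::IO;
use Data::Dumper;

# Intended to convert a 454 (Newbler) ACE file to a standard ACE file.
sub _newblerAceToAce {
  my($self,$args)=@_;
  # -variant=>'454' tells the parser to expect 454-style coordinates
  my $ace454=Bio::Assembly::IO->new(-file=>$$args{ace454Path},-format=>"ace",-variant=>'454');
  my $ace=Bio::Assembly::IO->new(-file=>">$$args{acePath}",-format=>"ace");
  #while(my $contig=$ace454->next_contig){
  while(my $scaffold=$ace454->next_assembly){
    print Dumper $scaffold;  # inspect the parsed object; nothing is written to $ace yet
  }
  return $$args{acePath};
}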

Can't call method "get_consensus_sequence" on an undefined value at Bio/Assembly/IO/ace.pm line 280, <GEN0> line 93349.

Further details:

From the BioPerl site: "The ACE files produced by the 454 GS Assembler (Newbler) do not conform to the reference ACE format. In 454 ACE, the consensus sequence reported covers only its clear range and the start of the clear range consensus is defined as position 1. Consequently, aligned reads in the contig can have negative positions. Be sure to use the '454' variant to have positive alignment positions. No attempt is made to construct the missing part of the consensus sequence (beyond the clear range) based on the underlying reads in the contig. Instead the ends of the consensus are simply padded with the gap character '-'."
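To make the coordinate issue concrete, here is a minimal sketch in plain Perl of the kind of shift the quoted passage describes. It is not BioPerl's implementation: the helper name pad_454_contig, the read names, and the toy data are all made up for illustration, and it assumes each read is given as a start position (possibly zero or negative) plus a length in consensus columns.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper (not part of BioPerl): shift 454-style coordinates so
# that every read starts at position >= 1, and pad the clear-range consensus
# with '-' so it spans all reads.
sub pad_454_contig {
    my ($consensus, $reads) = @_;    # $reads: { name => [start, length] }

    # Leftmost column used by any read (the consensus itself starts at 1).
    my $min_start = min(1, map { $_->[0] } values %$reads);
    my $shift     = 1 - $min_start;  # columns to add on the left

    # Rightmost column used by the consensus or any read.
    my $max_end = max(length($consensus),
                      map { $_->[0] + $_->[1] - 1 } values %$reads);
    my $right_pad = $max_end - length($consensus);

    my $padded  = ('-' x $shift) . $consensus . ('-' x $right_pad);
    my %shifted = map { $_ => $reads->{$_}[0] + $shift } keys %$reads;
    return ($padded, \%shifted);
}

# Toy contig: read1 starts before the clear range, read2 runs past its end.
my ($padded, $starts) = pad_454_contig(
    'ACGTACGT',
    { read1 => [-2, 6], read2 => [4, 7] },
);
print "$padded\n";
print "$_ starts at $starts->{$_}\n" for sort keys %$starts;
```

In this toy case the consensus gains three '-' columns on the left and two on the right, and both reads end up with positive start positions.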

14.0 years ago
Lee Katz ★ 3.2k

I sent this to the BioPerl mailing list on November 22 and have not had a response yet. The problem now is that Assembly::IO::ace::next_contig() is very slow; it will probably take about two days. I have not gotten far enough to figure out why.

Assembly::IO::ace.pm: I changed a regular expression on line 231 because the contig object was not initializing properly. For some reason the 454 ACE file had adopted the reference assembly's ID, so the contig name contained a GI number followed by a pipe. The pipe was not captured by \w+. I think the regex will be safe with \s(\S+)\s.

if (/^CO\s(\S+)\s(\d+)\s(\d+)\s(\d+)\s(\w+)/xms) # New contig starts!
#if (/^CO\s(\w+)\s(\d+)\s(\d+)\s(\d+)\s(\w+)/xms) # New contig starts!
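A quick way to see the difference between the two patterns is to run both against a CO line whose contig name contains pipes. This is only a sketch: the header line below is invented (the ID is not from a real file), and the two regexes are copied from the lines above.

```perl
use strict;
use warnings;

# Made-up CO header line with a pipe-delimited contig name (hypothetical ID).
my $line = "CO gi|12345|ref|NC_000000.1| 5000 42 10 U";

my $old = qr/^CO\s(\w+)\s(\d+)\s(\d+)\s(\d+)\s(\w+)/xms;  # original pattern
my $new = qr/^CO\s(\S+)\s(\d+)\s(\d+)\s(\d+)\s(\w+)/xms;  # patched pattern

# \w+ stops at the first '|', so the following \s cannot match.
print "old regex: ", ($line =~ $old ? "match ($1)" : "no match"), "\n";

# \S+ accepts any non-whitespace run, including the pipes.
print "new regex: ", ($line =~ $new ? "match ($1)" : "no match"), "\n";
```

The trade-off is that \S+ accepts any non-whitespace contig name, which is looser than \w+ but still bounded by the whitespace that separates the CO fields.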
  
Why not split the ACE file on every CO? That should be a quick operation, and if conversion is slow, at least you should be able to convert each contig in parallel.
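The splitting idea could look roughly like the sketch below, in plain Perl with no BioPerl dependency. The function name split_ace_on_co and the toy ACE text are hypothetical, and it assumes that a line beginning with "CO " only ever marks the start of a contig. Each chunk would presumably also need its own AS header line before a parser accepts it as a standalone ACE file; that step is not shown.

```perl
use strict;
use warnings;

# Hypothetical helper: split ACE text into a header (everything before the
# first CO line) and one chunk of text per contig.
sub split_ace_on_co {
    my ($ace_text) = @_;
    my $header = '';
    my @chunks;
    for my $line (split /^/m, $ace_text) {    # keep line endings
        if ($line =~ /^CO\s/) {
            push @chunks, $line;              # a new contig starts here
        }
        elsif (@chunks) {
            $chunks[-1] .= $line;             # belongs to the current contig
        }
        else {
            $header .= $line;                 # e.g. the AS line
        }
    }
    return ($header, \@chunks);
}

# Toy input, heavily abbreviated compared with a real ACE file.
my $ace = "AS 2 3\n\nCO contig1 8 2 1 U\nACGTACGT\n\nCO contig2 4 1 1 U\nTTTT\n";
my ($header, $chunks) = split_ace_on_co($ace);
printf "%d contigs\n", scalar @$chunks;
```

For a file with tens of thousands of lines, reading line by line from a filehandle and writing each chunk to its own file would avoid holding the whole assembly in memory.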


Looks like the author of the module fixed it all. Looking forward to the next version of BioPerl (or I guess someone could just get the newest from source control).
