Convert FASTQ to SCF using BioPerl
1
1
Entering edit mode
9.2 years ago
DaniCee ▴ 20

Hello everyone,

I want to use Phred and Phrap starting with a FASTQ file, as I posted in Phred/Phrap pipeline starting with FASTA file of paired-end reads and using a reference sequence which is still not solved.

I understand Phred needs a chromatogram file in SCF or ABI format as input. Hence, I need to convert my FASTQ file into SCF format, but I haven't found any converter around. Does anybody know any converter?

In the meantime, I am trying to convert the FASTQ file into SCF format using BioPerl, as suggested in Converting A Dna Sequence To Abi Or Scf Format

The BioPerl code provided in that thread does not work. I am on a Ubuntu machine and have already installed libstaden-read1 and staden-io-lib-utils (http://staden.sourceforge.net/), which support various DNA sequence read formats, in particular SCF, ABI, ALF, CTF, ZTR, SFF and SRF. Still cannot make it work.

After studying the following BioPerl documentation pages:

I came up with this Minimal Example, but I think it is still not working. Though I specify an output file to get my stuff printed on, I get some weird stuff printed on screen; I know SCF is a binary file, but is it what I get on screen my output SCF file? Cause the output SCF file I specify (myfile.scf) doesn't look good at all.

#!/usr/bin/perl -w
use warnings;
use strict;
use Bio::SeqIO;
#use Bio::Seq::Quality;

my $infile="myfile.fastq";
my $outfile="myfile.scf";

my $seqio_in=Bio::SeqIO->new(-file => $infile, -format => "fastq");# simple 'fastq' format defaults to 'sanger' variant
my $seqio_out=Bio::SeqIO->new(-file => ">$outfile", -format => "scf"); ## or 'abi'

# write each entry in the input file to the output file
# $seq_obj is a Bio::Seq::Quality object
while (my $seq_obj=$seqio_in->next_seq) {
    #this does not work
    #~ $seqio_out->write_seq($seq_obj);

    #this prints binary on screen instead of the file... is that my scf file on screen?
    $seqio_out->write_seq(
                -target => $seq_obj,
                -version => 2,
                -CONV => "Bioperl-Chads Mighty SCF writer.");

    #the documentation suggests to pass a SequenceTrace object to write_seq instead...
    #this prints the same binary stuff on screen instead of the file... is that my scf file on screen?
    #~ my $st = Bio::Seq::SequenceTrace->new(
            #~ -swq => $seq_obj);
    #~ $seqio_out->write_seq(
                #~ -target => $st,
                #~ -version => 2,
                #~ -CONV => "Bioperl-Chads Mighty SCF writer.");
}

I really need some help here. Many many thanks!

phred scf fastq bioperl • 3.5k views
ADD COMMENT
1
Entering edit mode
9.2 years ago

Check this out

#!/usr/bin/perl -w

eval 'exec /usr/bin/perl -w
-S $0 ${1+"$@"}'
    if 0; # not running under some shell
# $Id: seqconvert.PLS,v 1.3 2004/03/09 19:57:32 cjm Exp $

use strict;
use Getopt::Long;
use Bio::SeqIO;

my $help;
my $from=undef;
my $to=undef;

### please add to this list (see the modules under Bio/SeqIO):
my @known_formats=
  qw(gcg fasta ace raw fastq phd pir scf swiss genbank locuslink
     embl game qual bsml tab raw abi chado alf ctf exp ztr pln);

my $script=substr($0, 1+rindex($0,'/'));
my $usage="Usage:

  $script --from in-format --to out-format < file.in-format > file.out-format

Known formats:\n  " . join(' ', @known_formats) . "\n\n";

die $usage unless
  &GetOptions( 'from:s'   => \$from,
               'to:s'     => \$to,
               'h|help'   => \$help
      )
  && !$help &&  $from && $to
  && grep($from eq $_, @known_formats)
  && grep($to eq $_, @known_formats);

my $in  = Bio::SeqIO->newFh(-fh => \*STDIN , '-format' => $from);
my $out = Bio::SeqIO->newFh(-fh=> \*STDOUT, '-format' => $to);

print $out $_ while <$in>;

__END__

=head1 NAME

seqconvert - generic BioPerl sequence format converter

=head1 SYNOPSIS

  seqconvert --from in-format --to out-format < file.in-format > file.out-format
  # or
  seqconvert -f in-format -t out-format < file.in-format > file.out-format

=head1 DESCRIPTION

This script gives command line interface to BioPerl Bio::SeqIO.

=head1 SEE ALSO

L<Bio::SeqIO>

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list.  Your participation is much appreciated.

  bioperl-l@bioperl.org              - General discussion
  http://bioperl.org/MailList.shtml  - About the mailing lists

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via
email or the web:

  bioperl-bugs@bioperl.org
  http://bugzilla.bioperl.org/

=head1 AUTHOR - Philip Lijnzaad

Email p.lijnzaad@med.uu.nl

=cut
ADD COMMENT
0
Entering edit mode

Not working. Same problem I reported on Converting A Dna Sequence To Abi Or Scf Format and I tried to tackle with my code. I get the following error:

PROMPT> perl convert.pl --from fastq --to scf < myfile.fastq > myfile.scf

Reference found where even-sized list expected at /usr/share/perl5/Bio/SeqIO/scf.pm line 610, <STDIN> line 5.

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: You must pass a Bio::Seq::Quality or a Bio::Seq::SequenceTrace object to write_seq as a parameter named "target"
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:486
STACK: Bio::SeqIO::scf::write_seq /usr/share/perl5/Bio/SeqIO/scf.pm:625
STACK: Bio::SeqIO::PRINT /usr/share/perl5/Bio/SeqIO.pm:733
STACK: convert.pl:40
-----------------------------------------------------------
ADD REPLY
0
Entering edit mode

Is that working on your computer?

ADD REPLY
0
Entering edit mode

Please, the above solution DOES NOT WORK. Can anybody help with this?

ADD REPLY

Login before adding your answer.

Traffic: 1850 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6