How To Convert A Plain Sequence In A Perl String Variable To Fasta Format
11.6 years ago
Curious Mind ▴ 10

Hi,

I have a couple of plain sequences (without any formatting or annotation) saved in an SQLite3 database. After reading them into Perl strings, I need to convert them to FASTA format using Perl. I see methods in Bio::SeqIO for converting one file format to another, but my sequences are in Perl string variables, not in files.

Thanks

bioperl perl fasta • 4.6k views
11.6 years ago

I don't see why a library is necessarily helpful here. FASTA format is very simple, so all you need to do is come up with some IDs and then print something like: print ">$NewId\n$sequence\n";
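
To make that concrete, here is a minimal sketch of the plain-print approach for several sequences at once. The sub name to_fasta, the example sequences and the counter-based IDs (seq1, seq2, ...) are all made up for illustration; in practice you would probably want IDs that mean something in your database.

```perl
use strict;
use warnings;

# Hypothetical input: sequences already read from the database into an array.
my @sequences = ('acaaaatcttgagagatt', 'ttgacgatgca');

# Build FASTA text for a list of sequence strings.
# IDs are invented with a simple counter: seq1, seq2, ...
sub to_fasta {
    my @seqs  = @_;
    my $fasta = '';
    my $n     = 0;
    for my $seq (@seqs) {
        $n++;
        $fasta .= ">seq$n\n$seq\n";
    }
    return $fasta;
}

print to_fasta(@sequences);
```

Returning a string rather than printing directly keeps the result in a Perl variable, which seems to be what the question asks for.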

I'd be interested to know why people use libraries in this context, as I might be missing something.


Quite right; printing FASTA is simple without libraries (you might want to add line wrapping, since sequence lines are by convention not supposed to exceed 80 characters). I only explained the BioPerl solution because the OP was using Bio::SeqIO and seemed confused by it.


Simple line wrapping can be done with unpack:

print '>', $seqId, "\n";
foreach my $seqLine (unpack('(a[60])*', $seqStr)) {
    print $seqLine, "\n";
}
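
If you need this in more than one place, the same unpack idea can be wrapped in a small sub. This is only a sketch: the name fasta_record and the optional width argument (defaulting to 60) are my own choices, not part of any library.

```perl
use strict;
use warnings;

# Format one FASTA record, wrapping the sequence at $width characters.
# The sub name and the default width of 60 are illustrative choices.
sub fasta_record {
    my ($id, $seq, $width) = @_;
    $width ||= 60;
    # '(aN)*' cuts the string into N-character chunks; the last may be shorter.
    my @lines = unpack("(a$width)*", $seq);
    return join("\n", ">$id", @lines) . "\n";
}

# Wrap at 10 characters just to make the effect visible on a short sequence.
print fasta_record('mySeq1', 'acaaaatcttgagagatt', 10);
```

Because the sub returns the record as a string, you can print it, append it to a file, or keep it in a variable.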


I cannot see any reason why a library would be useful. This solution is simple, fast and scalable.

11.6 years ago
Neilfws 49k

The Bio::SeqIO HOWTO might be helpful.

Creating sequence objects in BioPerl requires Bio::Seq in addition to Bio::SeqIO.

Assuming that your sequence string is $string and you want to write to file myFile.fa:

#!/usr/bin/perl -w

use strict;
use Bio::SeqIO;
use Bio::Seq;

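# The plain sequence, e.g. as read from the database
my $string   = "acaaaatcttgagagatt";
# Wrap the string in a sequence object; the ID becomes the FASTA header
my $seq      = Bio::Seq->new(-display_id => "mySeq1", -seq => $string);
# Output stream in FASTA format; the leading ">" opens myFile.fa for writing
my $outseq   = Bio::SeqIO->new(-format => "fasta", -file => ">myFile.fa");

$outseq->write_seq($seq);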

Obviously, without annotation you will have to devise a sensible way to generate sequence IDs for the Fasta header.
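
One possible ID scheme, sketched below, is to build the ID from each row's primary key. Everything here is an assumption for illustration: the @rows data, the helper name make_seq_id, the "mySeq" prefix and the six-digit zero padding. It uses plain printing so that it runs without BioPerl installed.

```perl
use strict;
use warnings;

# Hypothetical rows as they might come back from the database:
# each entry is [primary key, sequence string].
my @rows = ([7, 'acaaaatcttgagagatt'], [12, 'ttgacgatgca']);

# Derive a stable ID from the primary key, zero-padded to six digits.
sub make_seq_id {
    my ($prefix, $key) = @_;
    return sprintf('%s_%06d', $prefix, $key);
}

for my $row (@rows) {
    my ($key, $seq) = @$row;
    my $id = make_seq_id('mySeq', $key);
    print ">$id\n$seq\n";
}
```

The same $id could just as well be passed as -display_id to Bio::Seq->new in the script above.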
