Hi everybody,
I am trying to convert the sequences in a FASTA file from the interleaved format to a sequential format.
My input:
>gi|161085638|dbj|AB305033.1|
ATATGCCTGAAAGTGGCGGACGGGTGAGTAACACGTGGGTGACCTGCCTCGGAGTGGGGGATAACCATGG
GAAACTGTGGCTAATACCGCATGGGCTTGTTGGCTTTGGCGGCCAACGAGTAAAGCTTTAGTGCTTCGAG
AGGGGCCTGCGTCCGATTAGGTAGTTGGTGAGGTAATGGCTCACCAAGCCGATGATCGGTAGCTGGTCTG
>gi|161085638|dbj|AB305644.1|
ATATGCCTGAAAGTGGCGGACGGGTGAGTAACACGTGGGTGACCTGCCTCGGAGTGGGGGATAACCATGG
GAAACTGTGGCTAATACCGCATGGGCTTGTTGGCTTTGGCGGCCAACGAGTAAAGCTTTAGTGCTTCGAG
AGGGGCCTGCGTCCGATTAGGTAGTTGGTGAGGTAATGGCTCACCAAGCCGATGATCGGTAGCTGGTCTG
Desired output:
>gi|161085638|dbj|AB305033.1|
ATATGCCTGAAAGTGGCGGACGGGTGAGTAACACGTGGGTGACCTGCCTCGGAGTGGGGGATAACCATGGGAAACTGCGGCCAACGAGTAAAGCTTTAGTGCTTC...
>gi|161085638|dbj|AB305644.1|
ATATGCCTGAAAGTGGCGGACGGGTGAGTAACACGTGGGTGACCTGCCTCGGAGTGGGGGATAACCATGGGGCTAATACCGCATGGGCTTGTTGGCTTTGGCGGC...
After trying unsuccessfully to write a script myself, I found the following at http://phototrophic.net/node/37:
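#!/usr/bin/perl
use strict; use warnings;
# Print each header on its own line and join the wrapped sequence lines after it.
open(my $fh, '<', 'file.fasta') or die "Cannot open file.fasta: $!";
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /^>/) { print "\n" if $. > 1; print "$line\n"; }
    else               { print $line; }
}
print "\n";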
But when I try to use this I get bashed:
bash: syntax error near unexpected token `in'
If anyone can explain why I get this syntax error, or can help me with a script to convert interleaved files to sequential files, that would be greatly appreciated.
Best,
Sam
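On the error itself: the "bash:" prefix suggests the shell, rather than perl, was parsing the code, most likely because the lines were pasted into the terminal or the file was run with bash instead of perl. For the conversion, here is a minimal sketch in Python rather than Perl, assuming a plain-text FASTA file in which every record starts with a ">" header line. The function names and the output file name are made up for illustration.

```python
def unwrap_fasta(lines):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            # Sequence lines before the first header are ignored.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def convert(in_path, out_path):
    """Write each record as one header line followed by one sequence line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, seq in unwrap_fasta(src):
            dst.write(f"{header}\n{seq}\n")
```

Calling convert("file.fasta", "file_sequential.fasta") would read the interleaved file and write the sequential version next to it. It reads one line at a time, so large files should not be a problem.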
Thanks, will try this
Excellent use of Bio::SeqIO.