Fasta Conversion
Hi there,
Is there a way to change the identifiers in a FASTA file?
For example, from
>fastsdde135667667
actgcagtctga
>fgdte12875
actggact
to
>Seq1
actgcagtctga
>Seq2
actggact
fasta
written 13.0 years ago by Syawash, updated 13.0 years ago by Daniel
Use awk:
awk '/^>/ { printf(">Seq%d\n",(++i)); next;} { print }' < input.fa > output.fa
Example:
echo ">fastsdde135667667
actgcagtctga
>fgdte12875
actggact" | awk '/^>/ { printf(">Seq%d\n",(++i)); next;} { print }'
>Seq1
actgcagtctga
>Seq2
actggact
Alternatively, in Perl:
#!/usr/bin/perl
use strict;
use warnings;

my $count = 1;
while (<>) {
    # Replace each header line with a sequential ID
    if (s/^>.*/>Seq$count/) {
        $count++;
    }
    print;
}
>Seq1
actgcagtctga
>Seq2
actggact
With Biopieces and add_ident:
read_fasta -i input.fa | add_ident -k SEQ_NAME -p Seq | write_fasta -o output.fa -x
Hi Pierre. Just curious whether you can add zero padding using only awk, e.g. seq1 --> seq0001, seq253 --> seq0253. Happy New Year :)
@Eric, yes, that works like the standard C printf. For four digits, as in your example: printf(">Seq%04d\n",(++i))
@Pierre, nice! Thanks. Have to learn more C and C++ some time.
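For anyone who wants the zero-padded variant from the comments as a reusable snippet, here is a minimal sketch. The function name rename_fasta_padded and the default width of 4 are my own assumptions, not something from the answers above. It builds the printf format string from the requested width, so the same awk one-liner covers Seq001, Seq0001, and so on.

```shell
# Rename FASTA headers to zero-padded sequential IDs (Seq0001, Seq0002, ...).
# rename_fasta_padded is a hypothetical helper name; the width defaults to 4.
rename_fasta_padded() {
    # $1: number of digits to pad to; FASTA records are read from stdin
    awk -v width="${1:-4}" '
        # Build the format once, e.g. ">Seq%04d\n" when width is 4
        BEGIN { fmt = ">Seq%0" width "d\n" }
        # Header line: replace it with the next padded ID
        /^>/  { printf(fmt, ++i); next }
        # Sequence line: pass through unchanged
        { print }
    '
}

# Usage with the example from the question
printf '>fastsdde135667667\nactgcagtctga\n>fgdte12875\nactggact\n' | rename_fasta_padded 4
```

With the question's input this prints >Seq0001 and >Seq0002 as the headers, with the sequence lines unchanged. Note that the original identifiers are discarded, so keep a copy of the input file if you need to map the new names back.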