Question

How To Add Specific Word To Fasta Header

2

13.2 years ago

Palu ▴ 290

I have more than 5,000 FASTA sequences in a file and want to add a word, for instance "phosphate", to the header of every sequence. Please suggest a Perl solution for that.

fasta • 18k views

ADD COMMENT • link updated 3.1 years ago by O.rka ▴ 740 • written 13.2 years ago by Palu ▴ 290

score 7 · Answer 1 · 2011-08-31

7

13.2 years ago

brentp 24k

In place:

    perl -pi -e "s/^>/>phosphate-/g" your.fasta

Or write to a new file:

    perl -p -e "s/^>/>phosphate-/g" your.fasta > phosphate.fasta

To add it to the end of the header instead, use this regexp:

    's/^(>.*)$/$1-phosphate/g'

ADD COMMENT • link 13.2 years ago by brentp 24k

0

Thanks brentp, but I want to add the word at the end of my header. Is there a trick for that?

ADD REPLY • link 13.2 years ago by Palu ▴ 290

0

@palu I edited my answer, see the last line.

ADD REPLY • link 13.2 years ago by brentp 24k

0

Thank you very much, sir. For any layperson like me, the final code will look like this:

perl -p -e "s/^(>.*)$/$1-phosphate/g" your.fasta > phosphate.fasta

ADD REPLY • link 13.2 years ago by Palu ▴ 290

0

Except that you should use single quotes; inside double quotes the shell expands $1 (usually to an empty string) before Perl sees it: perl -p -e 's/^(>.*)$/$1-phosphate/g' in.fasta > out.fasta
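To see what Perl actually receives under each quoting style, here is a quick check. It assumes a POSIX shell with no positional parameters set; the set -- line is only there to make that assumption explicit, and the variable names are made up.

```shell
# Clear the positional parameters so $1 is empty, as in a fresh shell session.
set --

# Double quotes: the shell substitutes its own $1 before Perl runs.
double=$(printf '%s\n' "s/^(>.*)$/$1-phosphate/g")

# Single quotes: the expression reaches Perl untouched.
single=$(printf '%s\n' 's/^(>.*)$/$1-phosphate/g')

printf 'double-quoted: %s\n' "$double"
printf 'single-quoted: %s\n' "$single"
```

With the double-quoted form the capture reference is gone, so Perl would replace each header with just "-phosphate" rather than appending to it.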

ADD REPLY • link 13.2 years ago by brentp 24k

0

palu, if you like Brent's answer the best, you should select it as such (hover over the votes to do that).

ADD REPLY • link 13.2 years ago by Neilfws 49k

0

@newlife Thanks, I will do that.

ADD REPLY • link 13.2 years ago by Palu ▴ 290

score 4 · Answer 2 · 2011-08-31

4

13.2 years ago

Daniel ★ 4.0k

An easy way with sed:

sed 's/>.*/&_phosphate/' foo.in >bar.out
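A small test run of the same idea, as a sketch: the two sample records are invented, and the leading ^ anchor is an extra precaution that is not in the command above (it restricts the match to lines that start with ">").

```shell
# Create a small sample input (contents are made up for illustration).
printf '>seq1 kinase\nACGT\n>seq2\nGGCC\n' > foo.in

# "&" stands for the whole matched header line; "_phosphate" is appended to it.
sed 's/^>.*/&_phosphate/' foo.in > bar.out

cat bar.out
```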

ADD COMMENT • link 13.2 years ago by Daniel ★ 4.0k

score 2 · Answer 3 · 2015-05-30

2

9.5 years ago

Brian Bushnell 20k

A faster option, from the BBMap package:

bbrename.sh in=file.fasta out=renamed.fasta prefix=phosphate addprefix=t

ADD COMMENT • link 9.5 years ago by Brian Bushnell 20k

0

Is there a way to get this to work with protein fasta?

ADD REPLY • link 3.1 years ago by O.rka ▴ 740

score 0 · Answer 4 · 2015-05-30

0

9.5 years ago

arnstrm ★ 1.9k

I know there are lots of options and that this can easily be done with many Unix one-liners, but here is another alternative (my favorite).

bioawk -c fastx '{ print ">PREFIX"$name; $seq }' input.fasta
bioawk -c fastx '{ print ">"$name"|SUFFIX"; $seq }' input.fasta

ADD COMMENT • link 9.5 years ago by arnstrm ★ 1.9k

0

Hi, I'm not sure I understand the (bio)awk syntax, but your command did not work for me (it did not print the sequences). I replaced the semicolon with a newline:

bioawk -c fastx '{ print ">PREFIX" $name "\n" $seq }' input.txt > output.txt

which seems to work. Anyway thanks for pointing me towards the solution.
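If bioawk is not installed, plain awk can do something similar. This is only a sketch under a few assumptions: the file names and sample records are made up, PREFIX and |SUFFIX are the same placeholders as above, and unlike the bioawk version it keeps the full header line and leaves multi-line sequences wrapped as they are.

```shell
# Sample input with one multi-line record (contents are made up).
printf '>seq1 desc\nACGT\nTTGA\n>seq2\nGGCC\n' > input.fasta

# Prefix: rebuild header lines as ">PREFIX" plus everything after the ">".
awk '/^>/ { print ">PREFIX" substr($0, 2); next } { print }' input.fasta > prefixed.fasta

# Suffix: append "|SUFFIX" to header lines; print all other lines unchanged.
awk '/^>/ { print $0 "|SUFFIX"; next } { print }' input.fasta > suffixed.fasta

cat prefixed.fasta suffixed.fasta
```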

ADD REPLY • link 8.2 years ago by al-ash ▴ 210