I have more than 5000 FASTA sequences in a file and want to add a word, for instance "phosphate", to the header of every sequence. Please suggest a Perl solution for that.
In place:
perl -pi -e "s/^>/>phosphate-/g" your.fasta
Or to a new file:
perl -p -e "s/^>/>phosphate-/g" your.fasta > phosphate.fasta
To add it to the end of the header instead, use this regexp:
's/^(>.*)$/$1-phosphate/g'
An easy way with sed:
sed 's/>.*/&_phosphate/' foo.in >bar.out
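Before running any of these on the real 5000-sequence file, it may help to try them on a tiny test file. The following is only a sketch: the two records and the file names demo.fasta, prefix.fasta and suffix.fasta are made up for illustration, and the patterns are anchored with ^ so that only lines starting with > are touched.

```shell
# Build a two-record demo FASTA in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf '>seq1 sample A\nACGTACGT\n>seq2\nGGCCTTAA\n' > "$tmpdir/demo.fasta"

# Prefix: insert the word right after the leading ">".
sed 's/^>/>phosphate-/' "$tmpdir/demo.fasta" > "$tmpdir/prefix.fasta"

# Suffix: "&" stands for the whole matched header line.
sed 's/^>.*/&_phosphate/' "$tmpdir/demo.fasta" > "$tmpdir/suffix.fasta"

# Show both results; the sequence lines should be unchanged.
cat "$tmpdir/prefix.fasta" "$tmpdir/suffix.fasta"
```

Comparing the header count before and after (grep -c '^>' on each file) is a cheap sanity check that no record was lost.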
A faster option, from the BBMap package:
bbrename.sh in=file.fasta out=renamed.fasta prefix=phosphate addprefix=t
I know there are lots of options and it can easily be done with many Unix one-liners, but here is another alternative (my favorite).
bioawk -c fastx '{ print ">PREFIX"$name; $seq }' input.fasta
bioawk -c fastx '{ print ">"$name"|SUFFIX"; $seq }' input.fasta
Hi, I'm not sure I understand the (bio)awk syntax, but your command did not work for me (it did not print the sequences). I used a newline instead of the semicolon:
bioawk -c fastx '{ print ">PREFIX" $name "\n" $seq }' input.txt > output.txt
which seems to work. Anyway, thanks for pointing me towards the solution.
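The likely reason for the difference is that in awk a semicolon ends the print statement, so a field that follows it is evaluated and then thrown away rather than printed. Here is a small sketch using plain awk as a stand-in (bioawk is an awk variant, so I would expect the same behaviour, but that is an assumption); the record 'seq1 ACGT' is invented, with $1 and $2 playing the roles of $name and $seq.

```shell
# Hypothetical two-field input: a name and a sequence.
record='seq1 ACGT'

# ";" ends the print statement, so "$2" is evaluated and then discarded.
with_semicolon=$(printf '%s\n' "$record" | awk '{ print ">PREFIX" $1; $2 }')

# Concatenating "\n" and $2 inside the same print emits both lines.
with_newline=$(printf '%s\n' "$record" | awk '{ print ">PREFIX" $1 "\n" $2 }')

# Show the two results separated by a marker line.
printf '%s\n---\n%s\n' "$with_semicolon" "$with_newline"
```

Only the second form prints the sequence line, which matches what was observed above.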
Thanks brentp, but I want to add it at the end of my header. Is there any trick for that?
@palu I edited my answer, see the last line.
Thank you very much, sir. For any layperson like me, the final code will look like this:
perl -p -e "s/^(>.*)$/$1-phosphate/g" your.fasta > phosphate.fasta
except you should use single quotes: perl -p -e 's/^(>.*)$/$1-phosphate/g' in.fasta > out.fasta
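To see why the quotes matter: inside double quotes the shell substitutes $1 itself before perl ever runs, and in an interactive shell $1 is normally empty. The sketch below only prints the two strings so the difference is visible; show_quoting is a hypothetical helper, called with an empty argument to mimic that situation.

```shell
# Wrap in a function so "$1" is the function's first argument.
show_quoting() {
    # Double quotes: the shell substitutes $1 before perl sees the program.
    printf '%s\n' "s/^(>.*)$/$1-phosphate/g"
    # Single quotes: the text reaches perl unchanged.
    printf '%s\n' 's/^(>.*)$/$1-phosphate/g'
}

# Call with an empty first argument, as $1 would be in a fresh terminal.
double_quoted=$(show_quoting '' | sed -n 1p)
single_quoted=$(show_quoting '' | sed -n 2p)

printf 'double: %s\nsingle: %s\n' "$double_quoted" "$single_quoted"
```

With the double-quoted version perl receives a replacement of just -phosphate, so each header would be overwritten rather than extended.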
palu, if you like Brent's answer the best, you should select it as such (hover over the votes to do that).
@newlife Thanks, I will do that.