Trim FASTA headers: remove everything after the hash (#)
2.5 years ago
mthm ▴ 50

This is my FASTA file:

>BEL-3_Dvir#LINE/BelPao
GTAATTCTTGTGGTAGTTTTATGTTCTATGCAGGTTCAATATCGTACGCGCTCTGTGATG
CCCGCGCAGTACGCGTTTGTTTCCGATAGTTGATAACAAGTAATCGGTACAAATCGATAT
>P-2_Dmel#LINE/P
AATCTCTATATATAAAACTGTTTGTCCTGACTGACTGACTGACTGACTGACTGACTGACT
GACTGATTGGTGATCAACGCACAGCCCAAACCGTAAGAGCTAGGAAGCTGAAATTTTCAC
TGTAGCTACCTTATGTGATGTAGGTGCACGTTAAGACGGGGTTTCGGGAAATTCCACCCG

I want to remove everything in each header from the hash onward:

>BEL-3_Dvir
GTAATTCTTGTGGTAGTTTTATGTTCTATGCAGGTTCAATATCGTACGCGCTCTGTGATG
CCCGCGCAGTACGCGTTTGTTTCCGATAGTTGATAACAAGTAATCGGTACAAATCGATAT
>P-2_Dmel
AATCTCTATATATAAAACTGTTTGTCCTGACTGACTGACTGACTGACTGACTGACTGACT
GACTGATTGGTGATCAACGCACAGCCCAAACCGTAAGAGCTAGGAAGCTGAAATTTTCAC
TGTAGCTACCTTATGTGATGTAGGTGCACGTTAAGACGGGGTTTCGGGAAATTCCACCCG

I tried this command:

sed -r -i 's/^>/.*#/\1/' file.fasta

The error is:

 -e expression #1, char 10: unknown option to `s'

How can I fix this?

header fasta hash trim • 672 views

Also, be extremely careful when using the -i option (in-place editing): if something goes wrong, you lose your input file. Always work on a copy of the original file, or send the sed output to a new file.
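The two safer workflows mentioned above could look roughly like this. It is a minimal sketch, not the only way to do it: the scratch directory and the file names (file.trimmed.fasta, the .bak suffix) are assumptions chosen for illustration, and the sample record is copied from the question.

```shell
# Work in a scratch directory so no real data is touched (illustrative setup).
workdir=$(mktemp -d)
printf '%s\n' '>P-2_Dmel#LINE/P' 'AATCTCTATATATAAAACTG' > "$workdir/file.fasta"

# Option 1: leave the input untouched and write the result to a new file.
sed '/^>/ s/#.*//' "$workdir/file.fasta" > "$workdir/file.trimmed.fasta"

# Option 2: edit in place, but have sed keep the original as file.fasta.bak.
sed -i.bak '/^>/ s/#.*//' "$workdir/file.fasta"

# Inspect the edited file and the preserved backup.
head -n 1 "$workdir/file.fasta" "$workdir/file.fasta.bak"
```

With either option the original headers are still recoverable if the expression turns out to be wrong.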

2.5 years ago
$ awk '/^>/ {sub(/#.*$/,"",$0)}1' test.fa
$ awk -F "#" '{print $1}' test.fa
$ sed -r '/^>/ s/#.*//' test.fa
$ cut -d '#' -f 1 test.fa

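The command you pasted has several issues. The pattern contains an unescaped /, so sed reads s/^>/.*#/ as a complete substitution and then treats the following \1 as a flag, which is why it reports "unknown option to `s'" at char 10. In addition, \1 refers to a capture group, but the pattern defines none. Never use -i unless you are sure the command is correct.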
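If you do want to keep the capture-group style of the original attempt, one possible corrected form is sketched below. The function name trim_headers is hypothetical and only there to make the command reusable; -E selects extended regular expressions and is accepted by both GNU and BSD sed.

```shell
# Hypothetical helper: trims FASTA headers read from files or from stdin.
trim_headers() {
    # On header lines only, capture '>' plus everything before the first '#'
    # as group 1, then replace the whole line with that group.
    sed -E '/^>/ s/^(>[^#]*)#.*/\1/' "$@"
}

# Example with the first record from the question (sequence shortened).
printf '%s\n' '>BEL-3_Dvir#LINE/BelPao' 'GTAATTCTTGTGGTAGTTTT' | trim_headers
```

Headers without a # and all sequence lines pass through unchanged, and the output goes to stdout, so redirect it to a new file rather than editing in place.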
