Remove double vertical bar "||" at the end of the sequence in a fasta file
5.5 years ago
AP ▴ 80

Hello, I have a FASTA file like this:

>jgi|FusspF23_1|104542
GGTTCTCTTGTCTGTTTTTAAAAGAGTCTCGAGACCCC||
>jgi|FusspF23_1|10121
GGGGCCGCCCTCGATTCATCAGCGAAATTGCTCAGTCGGGCG||

At the end of each sequence there is a "||" that I want to remove. I tried using sed, but that also removed the "|" characters in the description lines (e.g. ">jgi|FusspF23_1|104542"), which I want to keep.

Please help.

Thank you, Ambika

awk sed fasta • 1.4k views
5.5 years ago
sed 's/|*$//' in.fasta
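
An end-of-line substitution like this can be checked on the two records from the question. This is a minimal sketch; sample.fasta and cleaned.fasta are hypothetical scratch file names, not files from the original post.

```shell
# Recreate the two records from the question in a scratch file (hypothetical name).
printf '%s\n' \
  '>jgi|FusspF23_1|104542' \
  'GGTTCTCTTGTCTGTTTTTAAAAGAGTCTCGAGACCCC||' \
  '>jgi|FusspF23_1|10121' \
  'GGGGCCGCCCTCGATTCATCAGCGAAATTGCTCAGTCGGGCG||' > sample.fasta

# Remove any run of pipes at the end of a line. The $ anchor means the
# pipes inside the header lines are never matched.
sed 's/|*$//' sample.fasta > cleaned.fasta
cat cleaned.fasta
```

Because the pattern is anchored to the end of the line, no special handling of header lines is needed as long as no header ends in a pipe.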

We could also have sed match the first character of a line and replace only when the line does not start with a >:

sed '/^[^>]/s/|//g' in.fasta
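
The same idea can be written with a negated address, which skips header lines instead of selecting the others. This is a rough sketch; strip_seq_pipes and demo.fasta are hypothetical names used only for illustration.

```shell
# strip_seq_pipes (hypothetical helper): delete every | on lines that do
# not start with >, leaving header lines exactly as they are.
strip_seq_pipes() {
  sed '/^>/!s/|//g' "$1"
}

# Small demo file with pipes in the header and in the sequence line.
printf '>id|a|1\nAC|GT||\n' > demo.fasta
strip_seq_pipes demo.fasta
```

Unlike an end-anchored pattern, this removes pipes anywhere in a sequence line, not only at the end.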

Thank you so much, it works.


If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.

5.5 years ago
ATpoint 86k
awk '{gsub("\\|\\|","");print}' in.fasta
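
A stricter awk variant would leave header lines alone and remove only a trailing run of pipes. This is a sketch that assumes every header starts with >; awk_demo.fasta and awk_clean.fasta are hypothetical scratch files.

```shell
# Demo record with pipes in the header and at the end of the sequence.
printf '>jgi|x|1\nACGT||\n' > awk_demo.fasta

# Print header lines unchanged; on all other lines drop the trailing pipes.
awk '/^>/ { print; next } { sub(/\|+$/, ""); print }' awk_demo.fasta > awk_clean.fasta
cat awk_clean.fasta
```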
5.5 years ago
Joe 21k
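tr -d '|' < file.fasta > fixed.fasta
Note that tr deletes every | in the file, including those in the header lines, so this only fits files whose headers contain no |.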
5.5 years ago
AK ★ 2.2k
perl -pe 's/\|+$//' in.fasta