Remove whitespace from FASTA files, except in the FASTA headers
3.3 years ago

Hey everyone,

I have a multi-fasta file like this:

>NC_000914 464618..534825
gtgccttccattttggagcgggaccaaatcgcagcggttctggtaagtgcgagcagggac gtgccttccattttggagcgggaccaaatcgcagcggttctggtaagtgcgagcagggac
aaaacgccggccggcttgcgggaccatgcgatattacaactgctcgccacctacggactg aaaacgccggccggcttgcgggaccatgcgatattacaactgctcgccacctacggactg
cgatcaggagaaatccgcaacatgcggattgaggatatcgattggcggaccgaaaccatt cgatcaggagaaatccgcaacatgcggattgaggatatcgattggcggaccgaaaccatt

I would like to remove whitespace from the FASTA sequences but keep the whitespace in the FASTA headers (the lines starting with >). I currently use the command sed -i '/^>/ s/ .*//' file.fasta to remove whitespace from the headers, but now I want the opposite. Is this possible?

Thanks!

sequence • 1.7k views

Negate the header match in your current command. But be careful when using -i, since it edits the file in place. Also note that your current command is not correct for removing only the spaces in the header.


Thank you for the reply. But how do I negate that? Yes, I'll be careful with -i, thanks for the tip.


Use /^>/! to negate the address. Your current command removes the first space and everything after it on each header line, so do not use it for this purpose. Try this (GNU sed syntax): sed '/^>/! s/\s\+//g' test.fa
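
To make the negated address concrete, here is a minimal sketch of the same idea. It assumes a POSIX-style sed and swaps the GNU-only \s\+ for the character class [[:space:]]. The function name strip_seq_ws and the demo file demo.fa are hypothetical, made up for this illustration only.

```shell
# Hypothetical helper (not from the thread): print a FASTA file with all
# whitespace removed from sequence lines, leaving header lines untouched.
strip_seq_ws() {
    # /^>/!  applies the command only to lines that do NOT start with '>'
    # s/[[:space:]]//g  deletes every whitespace character on those lines
    sed '/^>/! s/[[:space:]]//g' "$1"
}

# Made-up demo input: one header containing a space, two sequence lines with spaces.
tmpdir=$(mktemp -d)
printf '%s\n' '>seq1 100..200' 'acgt acgt' 'ttga ccat' > "$tmpdir/demo.fa"

# No -i is used, so demo.fa is left unchanged and the result goes to a new file.
strip_seq_ws "$tmpdir/demo.fa" > "$tmpdir/demo.clean.fa"
cat "$tmpdir/demo.clean.fa"
```

Writing to a new file first and inspecting it before replacing the original is one way to stay safe, in line with the caution about -i above.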
