From Single Line To Fasta Format?
12.7 years ago
Carla ▴ 60

Hi, I'm parsing some sequences and I need to transform them from a single line to fasta format... The format I have now is something like this:

Name_of_the_specie AGTAGAGATTGAGATGAGGAGATTCCCATAGAGTTCGAATAGCTGTAATAGAG
Name_of_the_specie2 AGGAGGTTTGAGAGTTTTGAGAGGTGGGGAGAGTTTTTTAAAGGGGGAAAGAAGGAGAATGAGAGGGTTGAG  
Name_of_the_ specie_3 ACATGCGATGAGGGCCCTTTATAATTTAATTGCGCTAAATAGATCTCCCTTTAGGGATATTGAGGAAGAGGAGGGATTAGGAGATTAGAGAGATAT

and so on... (>300)

I want to make a single file containing all those sequences in fasta format, if possible using perl/awk/sed.

Thanks!

fasta parsing perl awk • 6.3k views
12.7 years ago
ngsgene ▴ 380

Assuming the sequence names contain no spaces, the sequences contain no spaces, and each record starts on a new line, this should be straightforward:

awk '{print ">" $1 "\n" $2}' file_name > new_filename

The command prints ">" before the name ($1), then a newline ("\n"), then the sequence ($2), which gives the fasta format:

>Name_of_the_specie 
AGTAGAGATTGAGATGAGGAGATTCCCATAGAGTTCGAATAGCTGTAATAGAG 
>Name_of_the_specie2 
AGGAGGTTTGAGAGTTTTGAGAGGTGGGGAGAGTTTTTTAAAGGGGGAAAGAAGGAGAATGAGAGGGTTGAG
>Name_of_the_ specie_3 
ACATGCGATGAGGGCCCTTTATAATTTAATTGCGCTAAATAGATCTCCCTTTAGGGATATTGAGGAAGAGGAGGGATTAGGAGATTAGAGAGATAT
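
With more than 300 records it can be worth checking that nothing got lost. Here is a minimal sketch of the same awk one-liner plus a count check. The scratch directory and the shortened two-record sample are placeholders made up for illustration, not the original data.

```shell
# Hypothetical two-record sample in a scratch directory (placeholder data).
workdir=$(mktemp -d)
printf '%s\n' \
  'Name_of_the_specie AGTAGAGATT' \
  'Name_of_the_specie2 AGGAGGTTTG' > "$workdir/file_name"

# ">" plus field 1, a newline, then field 2.
awk '{print ">" $1 "\n" $2}' "$workdir/file_name" > "$workdir/new_filename"

# Sanity check: there should be one header line per input line.
records=$(wc -l < "$workdir/file_name")
headers=$(grep -c '^>' "$workdir/new_filename")
echo "input lines: $records, FASTA headers: $headers"
```

If the two numbers differ, the input probably contains blank lines or lines that do not follow the name-then-sequence layout.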

Thank you, I was trying to do it in awk and was doing it wrong... now I know for next time!

12.7 years ago

If it is one line per name and sequence, you can use this command:

perl -ane 'print ">$F[0]\n$F[1]\n";' yourFile.txt > yourFile.fasta

many thanks! :)

12.7 years ago
Daniel ★ 4.0k

If your file is really as simple as you say then just do:

sed 's/^/>/' file1 >file2
sed 's/ /\n/' file2 >file3.fasta

You can do it in one line and avoid that intermediate file: sed -e 's/^/>/' -e 's/\ /\n/' inputfile > outputfile
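
One caveat: writing the newline as \n in the replacement works with GNU sed, but as far as I know it is not guaranteed by every sed implementation. A sketch that should be more portable puts a literal newline after a backslash instead. The sample data and scratch directory are placeholders.

```shell
# Hypothetical two-record sample (placeholder data).
workdir=$(mktemp -d)
printf '%s\n' \
  'Name_of_the_specie AGTAGAGATT' \
  'Name_of_the_specie2 AGGAGGTTTG' > "$workdir/inputfile"

# Prefix each line with ">", then replace the first space with a newline.
# The newline is typed literally after a backslash rather than written as \n.
sed -e 's/^/>/' -e 's/ /\
/' "$workdir/inputfile" > "$workdir/outputfile"

cat "$workdir/outputfile"
```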


as long as there are no spaces in your species name...
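
If some names do contain spaces, one possible workaround is to treat the last whitespace-separated field as the sequence and everything before it as the name. This is only a sketch and assumes the sequence itself never contains whitespace. The sample data and file names are made up for illustration.

```shell
# Hypothetical sample: the second name contains a space (placeholder data).
workdir=$(mktemp -d)
printf '%s\n' \
  'Name_of_the_specie AGTAGAGATT' \
  'Name_of_the_ specie_3 ACATGCGATG' > "$workdir/inputfile"

# Rebuild the name from fields 1..NF-1 and take the last field as the sequence.
# Lines with fewer than two fields (for example blank lines) are skipped.
awk 'NF >= 2 {
    name = $1
    for (i = 2; i < NF; i++) name = name " " $i
    print ">" name "\n" $NF
}' "$workdir/inputfile" > "$workdir/outputfile"

cat "$workdir/outputfile"
```

Keep in mind that many tools read only the text up to the first space in a header as the sequence ID, so it may be safer to replace spaces in names with underscores first.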


thanks! yes, I didn't have spaces in the names :)
