Hi, I want to extract all the headers from my FASTA file. Here is an example:
>Eukaryota;Alveolata;Dinoflagellata;Dinophyceae;Peridiniales;Kryptoperidiniaceae;Unruhdinium;Unruhdinium_kevei;
ATGCTTGTCTCAAAGATTAAGCCA......
All I want is to extract the lines starting with ">", split each name (the text before each ";") into its own column, and write the result to a CSV file.
I don't really know how to do this, and I would really appreciate some help!
OP wants comma-separated output. You may want to amend your solution accordingly.
Yikes, amended to output CSV.
It will leave the initial ">" in. If that is not wanted, it can be removed by extending the solution above.
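For anyone landing here later, this is a minimal Python sketch of one way to get the headers into CSV with the leading ">" removed. It is not the solution referred to in the comments above, just an illustration. The function names (fasta_headers_to_rows, write_headers_csv) are made up for this example, and it assumes every header uses ";" as the field separator and that the trailing ";" should not produce an empty last column.

```python
import csv
import io


def fasta_headers_to_rows(lines):
    """Yield one list of fields per FASTA header line (hypothetical helper)."""
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        # Drop the leading ">", surrounding whitespace, and any trailing ";"
        header = line[1:].strip().rstrip(";")
        yield header.split(";")


def write_headers_csv(fasta_lines, out_handle):
    """Write each header as one comma-separated row (hypothetical helper)."""
    writer = csv.writer(out_handle)
    for row in fasta_headers_to_rows(fasta_lines):
        writer.writerow(row)


# Example input taken from the question; the sequence line is ignored.
example = [
    ">Eukaryota;Alveolata;Dinoflagellata;Dinophyceae;Peridiniales;"
    "Kryptoperidiniaceae;Unruhdinium;Unruhdinium_kevei;\n",
    "ATGCTTGTCTCAAAGATTAAGCCA......\n",
]

buffer = io.StringIO()
write_headers_csv(example, buffer)
print(buffer.getvalue(), end="")
```

To run it on a real file, pass an open input file as fasta_lines and an output file opened with newline="" as out_handle; the file names are up to you. Headers with different numbers of fields will simply produce rows of different lengths.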