Hello! I'm new to bioinformatics. I have a large FASTA file and want to extract each sequence header into a new tab-delimited file. I also want to give each sequence a new, shorter header and record that new header in the tab-delimited file as well.
For example:
>AJ58900.1:1099-2670 Carius bactus
GCAAATATTTAGCCCAACATGAATCCAAAAAAAAATATTAAACAAATAAAACCAAAACATTTAATCATTT
The new file should look something like this:
CB Carius bactus AJ58900.1:1099-2670
And the fasta file should then look like this:
>CB
GCAAATATTTAGCCCAACATGAATCCAAAAAAAAATATTAAACAAATAAAACCAAAACATTTAATCATTT
So many steps! I know how to extract things from a file, but I'm not sure how to handle the replacing. Should I start by creating the tab-delimited file with the short new names in it, and then replace each old header with the new one while copying the old header into the table next to its new name?
Thanks in advance!
What is your end goal? Are you sure you want to replace the FASTA headers? Anyway, it's not too complicated. If you have a file mapping each existing header to its new header, you can use Biopython to read the FASTA file and replace the headers (the record id). If you're not familiar with Python, I think it will be a nice exercise to start with :) Oh, and it's originally Karius and Baktus.
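To make the idea concrete, here is a rough sketch in plain Python (standard library only, so not Biopython itself) that does both steps in one pass: it writes the tab-delimited table and the renamed FASTA at the same time. The function names `short_name` and `rename_fasta` are made up for this example. The rule for building the short name (initials of the description, so "Carius bactus" becomes "CB") is only a guess based on the example in the question, and the numeric suffix for repeated initials is likewise an assumption.

```python
import io


def short_name(description):
    """Hypothetical naming rule: initials of the description, 'Carius bactus' -> 'CB'."""
    return "".join(word[0].upper() for word in description.split())


def rename_fasta(fasta_in, fasta_out, table_out):
    """Copy a FASTA stream with shortened headers and log each rename in a table."""
    seen = {}
    for line in fasta_in:
        if line.startswith(">"):
            # Split '>ID description' into the ID and the rest of the header.
            old_id, _, description = line[1:].strip().partition(" ")
            # Fall back to the old ID if the header has no description.
            new_id = short_name(description) or old_id
            # Assumption: repeated initials get a numeric suffix (CB, CB_2, ...).
            seen[new_id] = seen.get(new_id, 0) + 1
            if seen[new_id] > 1:
                new_id = f"{new_id}_{seen[new_id]}"
            table_out.write(f"{new_id}\t{description}\t{old_id}\n")
            fasta_out.write(f">{new_id}\n")
        else:
            # Sequence lines are copied unchanged.
            fasta_out.write(line)


# Small in-memory demo using the record from the question.
fasta = io.StringIO(
    ">AJ58900.1:1099-2670 Carius bactus\n"
    "GCAAATATTTAGCCCAACATGAATCCAAAAAAAAATATTAAACAAATAAAACCAAAACATTTAATCATTT\n"
)
new_fasta, table = io.StringIO(), io.StringIO()
rename_fasta(fasta, new_fasta, table)
print(table.getvalue(), end="")
print(new_fasta.getvalue(), end="")
```

For real data you would pass file objects from `open(...)` instead of the `io.StringIO` buffers. With Biopython, the same loop could be written around `SeqIO.parse`, changing each record's `id` and `description` before writing it back out. In a large file, initials will probably collide often, so a mapping file you write yourself may be the safer source of new names.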