9.5 years ago
oliver.bayfield
I want to reduce the headers in a FASTA file from the long version below to just the GI number, i.e. from:
>gi|103058628|gb|DQ517338.1| Staphylococcus phage 80alpha, complete sequence
AGGTATCTGCATAGTTATTCCGAACTTCCAATTAATAAAACTCTATACCCGTAATCTTCAATGAGTTCTG
GCGCTTCCCTTTAATTCCTTTTACATATTCAAAATGAATGTTTTTGATTGCCATCTTTATGAATTCAGTT
TTTAACTCATCTTCCATTAATTCCCAGCCGTTTAGCAATGAATACTTGAAATTTTTAATCTTCTCATAGT
To:
>103058628
AGGTATCTGCATAGTTATTCCGAACTTCCAATTAATAAAACTCTATACCCGTAATCTTCAATGAGTTCTG
GCGCTTCCCTTTAATTCCTTTTACATATTCAAAATGAATGTTTTTGATTGCCATCTTTATGAATTCAGTT
TTTAACTCATCTTCCATTAATTCCCAGCCGTTTAGCAATGAATACTTGAAATTTTTAATCTTCTCATAGT
I'm guessing awk or grep has the technology!
Tip: awk and grep extract things, sed alters things. See Pierre's answer.
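Following the tip above, here is a minimal sed sketch of the substitution. It assumes every header has the form `>gi|<digits>|...` as in the example; the function name `strip_to_gi` is made up for illustration and is not taken from any answer in this thread. Lines that do not match (the sequence lines) pass through unchanged.

```shell
# Hypothetical helper: keep only the GI number in NCBI-style FASTA headers.
# Assumes headers look like ">gi|<digits>|..."; all other lines are left alone.
# -E enables extended regular expressions, where \| is a literal pipe character.
strip_to_gi() {
  sed -E 's/^>gi\|([0-9]+)\|.*/>\1/' "$@"
}

# Small in-line example; with no file argument the function reads standard input.
printf '%s\n' \
  '>gi|103058628|gb|DQ517338.1| Staphylococcus phage 80alpha, complete sequence' \
  'AGGTATCTGCATAGTTATTCCGAACTTCC' | strip_to_gi
```

For a real file this would be something like `strip_to_gi input.fasta > output.fasta`, where both file names are placeholders. Writing to a new file rather than editing in place keeps the original headers available if the pattern turns out not to match some records.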