2.8 years ago
Princy
Hello everyone, I need to remove part of the header lines in a FASTA file. How can I do this? Please let me know. My file currently looks like this:
>QBIY01000001.1 L r b J isolate D scaffold_1, whole genome shotgun sequence
TCGCTTCCAGTTCCGGGTCTCTCTGTTCACTTCCCccttggcggccatttcagcgtgcctcgccggctcgctcgtcgcgaagttttgtcggctatgtccccaactctgagcgttttccTATCGGACTGCTttactgttgccaaccggactgtcttTATCG
I need my header to look like this:
>QBIY01000001.1
TCGCTTCCAGTTCCGGGTCTCTCTGTTCACTTCCCccttggcggccatttcagcgtgcctcgccggctcgctcgtcgcgaagttttgtcggctatgtccccaactctgagcgttttccTATCGGACTGCTttactgttgccaaccggactgtcttTATCG
Hello, thanks for the reply. I have edited my question. I don't want to change every line; I only want to change the header lines of the FASTA file. Please let me know how I can do it.
You can use the solution exactly the way I suggested it. The header has multiple whitespace-separated fields and you want just the first one. All the other lines are single fields themselves, so they will be unaffected (that is, they will be passed through to the output unchanged). This solution will only fail if your sequence lines contain whitespace, which would be very unusual. It's basically the same solution @cpad0112 went on to suggest (using awk instead).
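For anyone who prefers a script, here is a minimal Python sketch of the same first-field idea. It is not the command suggested above, just an illustration of the logic: the function name `strip_fasta_headers` and the shortened example sequence are made up for this post. Unlike the pure first-field approach, it only touches lines starting with `>`, so sequence lines pass through untouched even if they happen to contain whitespace.

```python
def strip_fasta_headers(lines):
    """Yield FASTA lines, cutting each header down to its first field.

    Hypothetical helper for illustration; header lines are those that
    start with '>', everything else is passed through unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Split on whitespace and keep only the identifier,
            # e.g. ">QBIY01000001.1"
            fields = line.split()
            yield fields[0] if fields else line
        else:
            yield line


# Shortened example record (sequence truncated for readability)
example = [
    ">QBIY01000001.1 L r b J isolate D scaffold_1, whole genome shotgun sequence\n",
    "TCGCTTCCAGTTCCGGG\n",
]
cleaned = list(strip_fasta_headers(example))
print("\n".join(cleaned))
```

To run it on a real file, you could pass an open file handle instead of the list (for example `strip_fasta_headers(open("input.fasta"))`, where the file name is a placeholder) and write the yielded lines to a new file.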