Here's the code:
sed -i 's/>\(.*\)\(....\)/>\2\1/' myfile.fasta
Here's the why, so you can learn for next time:
sed -i ## The -i flag means "edit the file in place rather than printing the result to the terminal"
's/abc/def/' ## s/ means to substitute what is within the first set of forward slashes / / with the second (all enclosed in single quotes)
\(12345\) ## Whatever matches between escaped parentheses \( and \) is captured and can be referred to by \NUMBER in the replacement section
>\(.*\) ## Here we say "find a greater-than symbol '>'"; then '.*' means "any character, any number of times", so we collect everything after the '>' into number 1 (for later) by putting it between \( and \).
\(....\) ## EXCEPT for the last four characters! A period '.' means any one character, so four of them put the last 4 characters of the line into number 2 for later. This works because '.*' grabs as much as it can and then gives back just enough for the four periods to match.
>\2\1 ## Now we say what we want to swap what we matched with, so we want the '>' symbol again, then the second thing we matched '\2' then the first thing '\1'
myfile.fasta ## Lastly, put the file you're working on.
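If you'd rather see what the command does before it touches your file, here is a small sketch. The header name and sequence are made up for illustration, and the temporary file stands in for myfile.fasta. Leaving off -i prints the result instead of editing the file, and -i.bak (a suffix attached directly to -i) edits in place while keeping a backup copy.

```shell
# Hypothetical sample data; the header text is invented for this demo.
demo=$(mktemp)
printf '>sample_sequence:133\nACGTACGT\n' > "$demo"

# Without -i, sed only prints the result; the file is left untouched.
preview=$(sed 's/>\(.*\)\(....\)/>\2\1/' "$demo")
printf '%s\n' "$preview"
# prints:
# >:133sample_sequence
# ACGTACGT

# With -i.bak, the file is edited in place and the original is saved
# next to it with a .bak extension.
sed -i.bak 's/>\(.*\)\(....\)/>\2\1/' "$demo"
```

Only the header line changes, because the sequence line contains no '>' for the pattern to match.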
Hope this helps you or others. #LazyFridayAfternoon.
Edit: If you wanted to lose the leading colon and add a period afterwards, this would work:
sed -i 's/>\(.*\):\(...\)/>\2.\1/' myfile.fasta
That says the ':' is outside of what you want to keep, then you keep only 3 characters, and then add a '.' after the match in the second half of the expression.
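A quick check of this variant on a made-up header (the name is hypothetical; only the ':' followed by three characters matters):

```shell
# Hypothetical header, piped through sed so no file is modified.
colon_result=$(printf '>sample_sequence:133\n' | sed 's/>\(.*\):\(...\)/>\2.\1/')
printf '%s\n' "$colon_result"
# prints: >133.sample_sequence
```

The colon is matched but sits outside both groups, so it is dropped from the output.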
#SuperLazyFridayAfternoon
Are you sure it's always 4 characters and not "the characters after the last semicolon, plus the semicolon itself"?
In this case, the last 4 characters work (they range between 133 and 166). However, I also take your suggestion as an option, thanks!
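For the record, the "everything after the last separator" option could look something like the sketch below. It assumes the separator is the ':' used in the edit above (swap in ';' in both places if your headers really use semicolons), and the header names are made up. '[^:]*' means "any run of characters that are not a colon", so the second group is the last colon plus whatever follows it, however long that is.

```shell
# Hypothetical headers with suffixes of different lengths.
suffix_result=$(printf '>seqA:99\n>seqB:12345\n' | sed 's/^>\(.*\)\(:[^:]*\)$/>\2\1/')
printf '%s\n' "$suffix_result"
# prints:
# >:99seqA
# >:12345seqB
```

The '^' and '$' anchors pin the match to the whole header line, so the suffix is always taken from the end.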