10.5 years ago
Phil S.
Hi there,
I have an ordinary-looking FASTA file like this:
>Translation: 2..112 (direct), 37 amino acids
ADTAQEFISTAVFGTSMSAHHILGLKPVPRVWLFAI*
>Translation: 1482..1790 (direct), 103 amino acids
MKKYTEQAKLSVVEDYCSGSAGHREVAHRHGVNANVIRKWLPIYRDKPVAPLPAFVPLQP
MPKRQADEAVVIALSLGDKSITVKWPISDPDGCARFIRSLSQ*
>Translation: 1787..2122 (direct), 112 amino acids
MIRIDAIWLATEPMDMRAGTETALVRVVAVFGAAKPHCAYLFANRRANRMKVLVHDGVGI
WLAARRLNQGKFHWPGTHRGLEVGLDAEQLQALVLGLPWQRVGANGAITMI*
What I want to do is rename the sequences with a running number that is always 5 digits long. That is, the three sequences above should be named like this:
>orf00001 2..112 (direct), 37 amino acids
ADTAQEFISTAVFGTSMSAHHILGLKPVPRVWLFAI*
>orf00002 1482..1790 (direct), 103 amino acids
MKKYTEQAKLSVVEDYCSGSAGHREVAHRHGVNANVIRKWLPIYRDKPVAPLPAFVPLQP
MPKRQADEAVVIALSLGDKSITVKWPISDPDGCARFIRSLSQ*
>orf00003 1787..2122 (direct), 112 amino acids
MIRIDAIWLATEPMDMRAGTETALVRVVAVFGAAKPHCAYLFANRRANRMKVLVHDGVGI
WLAARRLNQGKFHWPGTHRGLEVGLDAEQLQALVLGLPWQRVGANGAITMI*
So the width of 5 digits is fixed, and I just need to count upwards each time I see a '>'. Unfortunately, I don't know how to pad the number to a fixed length of five digits.
Thanks for your help (once again ;) )
Best,
Phil
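For readers with the same problem, here is a minimal Python sketch of one way to do the renaming. It is not necessarily the answer that was accepted in this thread. It assumes every header starts with '>' followed by a single token (here "Translation:") that should be replaced, and the function name `rename_headers` and the defaults `prefix="orf"` and `width=5` are my own choices for illustration. The zero-padding itself comes from the format spec `{count:0{width}d}`.

```python
def rename_headers(lines, prefix="orf", width=5):
    """Yield FASTA lines with the first header token replaced by a
    zero-padded running ID, e.g. '>Translation:' -> '>orf00001'.

    Assumption: the first whitespace-delimited token of each header
    is the part to replace; the rest of the header is kept as-is.
    """
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            count += 1
            # Drop the first token and keep whatever follows the first space.
            _, _, rest = line.partition(" ")
            # 0{width}d pads the counter with leading zeros to `width` digits.
            new_id = f">{prefix}{count:0{width}d}"
            yield f"{new_id} {rest}" if rest else new_id
        else:
            # Sequence lines pass through unchanged.
            yield line


if __name__ == "__main__":
    example = [
        ">Translation: 2..112 (direct), 37 amino acids",
        "ADTAQEFISTAVFGTSMSAHHILGLKPVPRVWLFAI*",
        ">Translation: 1482..1790 (direct), 103 amino acids",
        "MKKYTEQAKLSVVEDYCSGSAGHREVAHRHGVNANVIRKWLPIYRDKPVAPLPAFVPLQP",
    ]
    for out_line in rename_headers(example):
        print(out_line)
```

To use it on a real file, pass an open file handle as `lines` and write each yielded line followed by a newline. The same padding is available in printf-style formatting as `%05d`.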
Thank you so much for the fast and correct answer. The only thing I had to adjust was a line break...