Remove characters before the last underscore and replace them with a numerical value
2.5 years ago
Percy • 0

How do I edit FASTA headers so that everything before the last underscore is replaced with a numerical value, with identical prefixes getting the same number? For example, a section of my headers looks like this:

>NODE_23_length_59792_cov_23.204747_1 
>NODE_23_length_59792_cov_23.204747_2 
>NODE_23_length_59792_cov_23.204747_3 
>NODE_23_length_59792_cov_23.204747_4 
>NODE_23_length_59792_cov_23.204747_5 
>NODE_23_length_59792_cov_23.204747_6 
>NODE_23_length_59792_cov_23.204747_7 
>NODE_23_length_59792_cov_23.204747_8 

The desired output is:

>1_1 
>1_2 
>1_3 
>1_4 
>1_5 
>1_6 
>1_7 
>1_8 
python linux perl • 1.1k views

Just curious: why is the replacement better than what you already have? There is important information in that header that will be completely lost, and saving a couple of characters in disk space doesn't seem worth it.


Up to which character are the headers identical? For example, after node, after length, or after cov?


Only the last digit, after the last underscore, differs.
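
Given that clarification, here is a minimal Python sketch of one way to do the renaming. It is not the only approach, and it rests on a few assumptions: each distinct prefix (everything before the last underscore) gets the next integer in order of first appearance, headers carry no free-text description after the ID, and headers without any underscore are passed through unchanged. The function name `renumber_headers` is made up for this example.

```python
import sys


def renumber_headers(lines):
    """Yield lines, rewriting FASTA headers as '>N_suffix'.

    N is an integer assigned to each distinct prefix (the text before
    the last underscore) in order of first appearance. This numbering
    scheme is an assumption, not something the question specifies.
    """
    prefix_ids = {}  # prefix -> assigned integer
    for line in lines:
        if not line.startswith(">"):
            # Sequence line: pass through untouched.
            yield line
            continue
        header = line[1:].strip()  # drop '>' and trailing whitespace
        prefix, sep, suffix = header.rpartition("_")
        if not sep:
            # No underscore in the header: leave it as it is.
            yield line
            continue
        # Reuse the number for a known prefix, otherwise assign the next one.
        number = prefix_ids.setdefault(prefix, len(prefix_ids) + 1)
        yield f">{number}_{suffix}\n"


if __name__ == "__main__":
    # Read FASTA from stdin, write the renamed FASTA to stdout.
    sys.stdout.writelines(renumber_headers(sys.stdin))
```

Saved as, say, `renumber.py` (a hypothetical file name), it could be run as `python renumber.py < input.fasta > renamed.fasta`. Because the original prefix holds the length and coverage, it may be worth also writing the `prefix_ids` mapping to a file so that information can be recovered later.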
