Change lines in a file within a loop in Python
2.1 years ago
Paula ▴ 60

Hi All!

I am writing code to replace every line that contains the word 'NODE' with a new string stored in a variable called "new_line".

Here is the structure of the file

>NODE_1_length_592822_cov_338.586386_1 # 2 # 169 # -1 # ID=1_1;partial=10;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.518
MRMKGVRSLREMTRLLDTDQRLRKLCLIKTCEAAYPRSVLSRFIRKVGEDNLTRII

>NODE_1_length_592822_cov_338.586386_2 # 417 # 695 # 1 # ID=1_2;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.387
MPSLTEKKPPLVDIFEEGDYLRVLAELPSIDEKDISIETDGSTITITAENETKKYLKIVR
LPTHIKKGPIEFTHKNNILQVRLKKLCDAKDT*

Here is the desired final result

>SOL_1_3_cov_338.586386_N_1
MRMKGVRSLREMTRLLDTDQRLRKLCLIKTCEAAYPRSVLSRFIRKVGEDNLTRII

>SOL_1_3_cov_338.586386_N_2
MPSLTEKKPPLVDIFEEGDYLRVLAELPSIDEKDISIETDGSTITITAENETKKYLKIVR
LPTHIKKGPIEFTHKNNILQVRLKKLCDAKDT*

Here is the code:

i = 0
with open("scaffolds_to_rename.faa","r+") as a_file:
    for line in a_file:
        if 'NODE' in line:
            a = line.split('_')
            print(a)
            coverage = a[4]
            print(coverage)
            coverage_value = a[5]
            print(coverage_value)
            i = i + 1
            new_line = '>' + 'SOL_1_3' + '_' + str(coverage) + '_' + str(coverage_value) + '_' + 'N' + '_' + str(i)
            print(new_line)
            line = str(new_line)

THANKS!

loop python replace • 2.5k views

So you are trying to change the headers in a FASTA file. What exactly is your question? As it stands, you need to remove the redundant debug prints and also print the lines that don't contain NODE. You will also want to change your if statement to if line.startswith(">NODE"):


Hi Michael! Thanks for your reply! I am wondering how I can replace the lines that start with ">NODE" with the contents of the variable "new_line". As of now, the code creates the new line, but it does not replace the old lines with the new ones.


You need to print every line anyway, so no separate replacement step is needed; see the answer. You don't want to edit the same file in place, do you? Leave the input file as it is and create a new output file.

2.1 years ago
Michael 55k

Something like this (untested); the output goes to stdout:

i = 0
with open("scaffolds_to_rename.faa") as a_file:  # read-only; the input file is left untouched
    for line in a_file:
        line = line.rstrip("\n")  # drop the newline so print() does not double it
        if line.startswith(">NODE"):  # make sure only lines beginning with >NODE are changed
            a = line.split("_")
            coverage = a[4]        # the literal string 'cov'
            coverage_value = a[5]  # e.g. '338.586386'
            i = i + 1
            line = ">SOL_1_3_" + coverage + "_" + coverage_value + "_N_" + str(i)
        print(line)
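
If you would rather write the result straight to a new file than redirect stdout, the same loop can be wrapped in a small function. This is only a sketch: the function name rename_headers, its prefix parameter, and the output file name in the usage line are made up for illustration, and it assumes every header has the NODE_x_length_x_cov_x_x layout shown in the question.

```python
def rename_headers(in_path, out_path, prefix="SOL_1_3"):
    """Copy a FASTA file, rewriting headers that start with '>NODE'.

    Returns the number of headers that were renamed.
    (Hypothetical helper; names are illustrative only.)
    """
    count = 0
    # Open the input read-only and a separate file for output,
    # so the original file is never modified.
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">NODE"):
                fields = line.rstrip("\n").split("_")
                count += 1
                # fields[4] is the literal 'cov', fields[5] the coverage value
                line = f">{prefix}_{fields[4]}_{fields[5]}_N_{count}\n"
            # Sequence lines and blank lines are written unchanged
            dst.write(line)
    return count
```

For example, rename_headers("scaffolds_to_rename.faa", "scaffolds_renamed.faa") would leave the input as it is and put the renamed records in the second file.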
