2.1 years ago
Paula
Hi All!
I am writing code to replace every line that contains the word 'NODE' with a new string stored in a variable called "new_line".
Here is the structure of the file:
>NODE_1_length_592822_cov_338.586386_1 # 2 # 169 # -1 # ID=1_1;partial=10;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.518
MRMKGVRSLREMTRLLDTDQRLRKLCLIKTCEAAYPRSVLSRFIRKVGEDNLTRII
>NODE_1_length_592822_cov_338.586386_2 # 417 # 695 # 1 # ID=1_2;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.387
MPSLTEKKPPLVDIFEEGDYLRVLAELPSIDEKDISIETDGSTITITAENETKKYLKIVR
LPTHIKKGPIEFTHKNNILQVRLKKLCDAKDT*
Here is the desired final result:
>SOL_1_3_cov_338.586386_N_1
MRMKGVRSLREMTRLLDTDQRLRKLCLIKTCEAAYPRSVLSRFIRKVGEDNLTRII
>SOL_1_3_cov_338.586386_N_2
MPSLTEKKPPLVDIFEEGDYLRVLAELPSIDEKDISIETDGSTITITAENETKKYLKIVR
LPTHIKKGPIEFTHKNNILQVRLKKLCDAKDT*
Here is the code:
i = 0
with open("scaffolds_to_rename.faa", "r+") as a_file:
    for line in a_file:
        if 'NODE' in line:
            a = line.split('_')
            print(a)
            coverage = a[4]
            print(coverage)
            coverage_value = a[5]
            print(coverage_value)
            i = i + 1
            new_line = '>' + 'SOL_1_3' + '_' + str(coverage) + '_' + str(coverage_value) + '_' + 'N' + '_' + str(i)
            print(new_line)
            line = str(new_line)
THANKS!
So you are trying to change the headers in a FASTA file. What exactly is your question, apart from needing to remove the redundant debug prints and to also print the lines that don't contain NODE? You will also want to change your if statement to:
if line.startswith(">NODE"):
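To illustrate why the prefix test is the safer check, here is a small sketch comparing it with the substring test. The third header is a made-up example (not from the file in the question) of a record that merely mentions NODE somewhere in its description, and `is_node_header` is a hypothetical helper name.

```python
def is_node_header(line: str) -> bool:
    """Return True only for FASTA header lines whose ID starts with NODE."""
    return line.startswith(">NODE")


lines = [
    ">NODE_1_length_592822_cov_338.586386_1 # 2 # 169 # -1 # ID=1_1",
    "MRMKGVRSLREMTRLLDTDQRLRKLCLIKTCEAAYPRSVLSRFIRKVGEDNLTRII",
    ">contig_7 derived_from=NODE_3",  # hypothetical header, for illustration only
]

# Substring test: matches any line that contains NODE anywhere.
substring_hits = [line for line in lines if "NODE" in line]

# Prefix test: matches only headers that begin with >NODE.
prefix_hits = [line for line in lines if is_node_header(line)]

print(len(substring_hits), len(prefix_hits))
```

The substring test also picks up the hypothetical third header, while the prefix test selects only the NODE record.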
Hi Michael! Thanks for your reply! I am wondering how I can replace the lines that start with ">NODE" with the contents of the variable "new_line". As of now, the code creates the new line, but it does not replace the old lines with the new ones.
You need to print each line anyway, so there is no need for that; see the answer. You don't want to edit the same file in place, do you? Leave the input file as it is and create a new output file.
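A minimal sketch of that approach, assuming the new header is built exactly as in the question (fields 4 and 5 of the underscore-split header plus a running counter). The function names and the output file name are placeholders, not something given in the thread.

```python
def rename_header(header: str, index: int, prefix: str = "SOL_1_3") -> str:
    """Build the new header from an old '>NODE_...' header line."""
    fields = header.rstrip("\n").split("_")
    coverage_label = fields[4]  # 'cov' in the example headers
    coverage_value = fields[5]  # e.g. '338.586386'
    return f">{prefix}_{coverage_label}_{coverage_value}_N_{index}"


def rename_fasta_headers(in_path: str, out_path: str) -> int:
    """Copy in_path to out_path, rewriting every '>NODE' header.

    Sequence lines and any other headers are written unchanged.
    Returns the number of headers that were renamed.
    """
    count = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">NODE"):
                count += 1
                dst.write(rename_header(line, count) + "\n")
            else:
                dst.write(line)
    return count


# Example call (output name is an assumption):
# rename_fasta_headers("scaffolds_to_rename.faa", "scaffolds_renamed.faa")
```

Reassigning `line` inside the loop only rebinds a local name, which is why the original code never changed the file. Writing every line, renamed or not, to a second file keeps the input untouched and makes the result easy to check.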