My code so far:
temp_line = ''
out_lines = []
with open('dna.fasta.py', 'r') as f_in:
    text = f_in.read()
    text = text.split()
    print(text)
    for line in text:
        line = line.strip('\n')
        if line[0] == '>':
            temp_line = line.strip('>')
            out_lines.append(temp_line)
        else:
            out_lines.append(line)
print(out_lines)
print(temp_line)
with open('dna.phylip.py', 'w') as file_out:
    file_out.write('\n'.join(out_lines))
My Input file: dna.fasta.py
>Human
AGCATGCATCGATCGATCGACTAGCTAGCG
>Chimp
GATATGTCGAGATCGTCAGCTCGATCAGCT
>Gorilla
TGTGTCGATCTCGAGCTGAGTCGTCTATCA
My output file so far: dna.phylip file
Human
AGCATGCATCGATCGATCGACTAGCTAGCG
Chimp
GATATGTCGAGATCGTCAGCTCGATCAGCT
Gorilla
TGTGTCGATCTCGAGCTGAGTCGTCTATCA
Correct format of the output phylip file (what it should look like):
Human AGCATGCATCGATCGATCGACTAGCTAGCG
Chimp GATATGTCGAGATCGTCAGCTCGATCAGCT
Gorilla TGTGTCGATCTCGAGCTGAGTCGTCTATCA
I have no idea how to remove the '\n' and add a '\t'. I've tried the strip and split functions to remove the newline, and I've tried joining with '\t' to add the tab, but my methods haven't worked. I don't know what I'm doing wrong.
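One possible way to get the header and its sequence onto the same line, sketched under the assumption that every record is exactly one header line followed by one sequence line (as in the input above): remember the most recent header, and when the sequence line arrives, append the two joined by '\t' instead of appending them separately. The function name fasta_to_tab is made up for this illustration.

```python
def fasta_to_tab(fasta_text):
    """Turn single-line FASTA records into 'name<TAB>sequence' lines.

    Assumes each header line is followed by exactly one sequence line.
    """
    out_lines = []
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            name = line[1:]  # drop the leading '>'
        else:
            if name is None:
                raise ValueError("sequence line found before any header")
            # One output line per record: header, tab, sequence.
            out_lines.append(name + '\t' + line)
    return '\n'.join(out_lines)


example = (
    ">Human\nAGCATGCATCGATCGATCGACTAGCTAGCG\n"
    ">Chimp\nGATATGTCGAGATCGTCAGCTCGATCAGCT\n"
)
print(fasta_to_tab(example))
```

To use it with the files from the question, the text read from 'dna.fasta.py' could be passed to fasta_to_tab and the returned string written to the output file in place of '\n'.join(out_lines).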
It does not work for multi-line FASTA files, and it also does not work if the FASTA header contains several words, like the one below:
Hi Goutham,
Thanks for pointing that out :)
I fixed the problems; here is the new one-liner:
I hope it is helpful for the person who posted the question.
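For the two cases raised in this thread (sequences wrapped over several lines, and headers with more than one word), a plain Python version might look like the sketch below. It is not the one-liner referred to above, only an illustration; the names fasta_records and to_tab_lines and the first_word_only option are assumptions made up for this example.

```python
def fasta_records(lines):
    """Yield (header, sequence) pairs; a sequence may span several lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('>'):
            if header is not None:
                # A new header closes the previous record.
                yield header, ''.join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)  # collect wrapped sequence lines
    if header is not None:
        yield header, ''.join(chunks)  # emit the last record


def to_tab_lines(lines, first_word_only=False):
    """Format records as 'name<TAB>sequence' strings."""
    out = []
    for header, seq in fasta_records(lines):
        # Optionally keep only the first word of a multi-word header.
        name = header.split()[0] if first_word_only and header else header
        out.append(f"{name}\t{seq}")
    return out


demo = [">Human sample 1", "AGCA", "TGCA", ">Chimp", "GATA"]
print('\n'.join(to_tab_lines(demo)))
```

Because an open file object iterates line by line, it can be passed directly as the lines argument. Whether to keep the whole header or only its first word depends on what the downstream program accepts as a name, which is why it is left as an option here.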