This is a follow-up to my previous question, "How to align and compare two elements (sequence) in a list using python", which has been solved thanks to @mdml. Here is the code that I'm using (code credit to mdml):
# Parse the file line by line, splitting each line into split_list
lines = open("seq.txt")
for line in lines:
    split_list = line.split()
    header = "".join(split_list[0:2])
    seq = split_list[2]
    disorder = split_list[4]

    # Create the new disorder string
    new_disorder = ["Disorder: Posi R"]
    for i, x in enumerate(disorder):
        if x == "X":
            # Appends of the form: "Position AminoAcid"
            new_disorder.append("{} {}".format(i, seq[i]))

    new_disorder = " ".join(new_disorder)

    # Output the modified file
    open("seq2.txt", "w").write("\n".join([header, seq, new_disorder]))
This code works perfectly with my example, which is:
103L Sequence: MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNSLDAAKSELDKAIGRNTNGVITKDEAEKLFNQDVDAAVRGILRNAKLKPVYDSLDAVRRAALINMVFQMGETGVAGFTNSLRMLQQKRWDEAAVNLAKSRWYNQTPNRAKRVITTFRTGTWDAYKNL Disorder: ----------------------------------XXXXXX-----------------------------------------------------------------------------------------------------------------------------XX
However, when I use this code on a file with multiple protein sequences, it still runs, but only the last protein sequence and its disordered region show up in the new file. What should I do to fix it?
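One likely cause: opening a file in "w" mode truncates it every time, so if seq2.txt is opened once per sequence (or written only once after the loop has finished), only the last record survives. Below is a minimal sketch of one way around that, assuming each protein sits on its own line in the same five-field layout as the example above. The helper names `format_record` and `convert` are made up for illustration and are not part of mdml's code.

```python
def format_record(line):
    """Turn one 'ID Sequence: SEQ Disorder: MASK' line into a three-line record."""
    fields = line.split()
    header = "".join(fields[0:2])
    seq = fields[2]
    disorder = fields[4]

    # Collect "Position AminoAcid" pairs (zero-based positions, as in the
    # code above) for every residue marked "X" in the disorder mask.
    parts = ["Disorder: Posi R"]
    for i, x in enumerate(disorder):
        if x == "X":
            parts.append("{} {}".format(i, seq[i]))

    return "\n".join([header, seq, " ".join(parts)])


def convert(in_path, out_path):
    """Write one record per input line; the output file is opened only once."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.strip():  # skip blank lines (an assumption about the input)
                dst.write(format_record(line) + "\n")
```

Calling `convert("seq.txt", "seq2.txt")` should then write every sequence, one after another. Opening the output in append mode ("a") inside the loop would also keep all records, but it keeps adding to whatever is already in the file from earlier runs.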