4 months ago
pinheirofabiano
I'm writing the Python script below. I had some issues, but it's running now; my main difficulty is saving the output (a list) in FASTA format. Do I need to convert the list to a pandas DataFrame first, or is there a more straightforward way to do it? Can you help me find a Pythonic answer?
sequence = input("Digit the peptide sequence: ")
peptides_80 = []
for i in range(0, len(sequence)-80):
    peptides_80 += [''.join(sequence[i:i+80])]
ofile = open("subsequences.txt", "w")
for i in range(len(peptides_80)):
    ofile.write(">" + " " + "subsequence" + [i] + "\n" + peptides_80[i] + "\n")
ofile.close()
First off, what on earth is peptide(sequence): ? Are you missing a def there? Also, why reinvent the wheel instead of simply using BioPython?
many thanks, Ram
I formatted your code a bit and I have a quick question: Is the print statement supposed to be part of the loop? If so, please indent it 4 more spaces.
no, print() is outside..
Ram, I've completed the code. Can you check it now?
Check your code, please.
Not working yet. The txt file comes out blank.
Try printing everything to screen first:
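As a rough illustration of that kind of check (a sketch only, not the commenter's snippet): the helper below, with the hypothetical name preview_records, builds the header and sequence lines with the same slicing as the question and returns them so they can be printed before any file is opened. The demo sequence is made up.

```python
def preview_records(sequence, window=80):
    """Return the FASTA lines the question's two loops would produce.

    preview_records is a hypothetical helper name used for illustration.
    """
    # Same slicing as the question: one window per start position.
    peptides = [sequence[i:i + window] for i in range(0, len(sequence) - window)]
    lines = []
    for i, peptide in enumerate(peptides):
        # The index is formatted into the header as text.
        lines.append(f"> subsequence{i}")
        lines.append(peptide)
    return lines


if __name__ == "__main__":
    demo = "ACDEFGHIKLMNPQRSTVWY" * 5  # 100 residues, made-up test input
    records = preview_records(demo)
    print(len(records) // 2, "records")
    for line in records[:4]:
        print(line)
```

If the record count printed here is 0, the problem is in the first loop (for example, an input of 80 residues or fewer); if the lines look right on screen, the problem is in the file-writing step.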
Duplicate of your previous question, "convert list to fasta format". (Why did you delete it? Just update your original post.)
Many thanks. Yes, the post was getting confusing, so I decided to rewrite it. Almost there, but not working yet: the txt file is blank. I believe the second for loop is not working.
Done, working now, but only for long sequences (length > 80). Many thanks, Ram and Pierre Lindenbaum!
What did you do to get it working? Please add the final working code as an answer.
I just changed [i] to str(i) in the code above. It's made to work with long sequences (length > 80). This script splits peptide or nucleic acid sequences longer than 80 into overlapping sub-sequences of length 80, then converts the output to FASTA and saves it. It's useful, for example, if you want to investigate the biological properties of a protein fragment, such as protein binding sites.
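For reference, here is one way the working script could be written out in full. This is a sketch rather than the exact code from the thread: the function names sliding_windows and write_fasta are hypothetical, the range is extended by one so that the window ending at the last residue is kept (which also lets a sequence of exactly 80 residues produce one record), and the space after ">" is dropped because FASTA headers conventionally start right after it.

```python
def sliding_windows(sequence, window=80):
    """Return every contiguous sub-sequence of length `window`."""
    # "+ 1" keeps the final window; a sequence shorter than `window`
    # gives an empty range and therefore an empty list.
    return [sequence[i:i + window] for i in range(len(sequence) - window + 1)]


def write_fasta(peptides, path):
    """Write one FASTA record per peptide: a '>' header line, then the sequence."""
    # The with-block closes the file even if a write fails.
    with open(path, "w") as handle:
        for i, peptide in enumerate(peptides):
            # An f-string formats the integer index, avoiding str + list errors.
            handle.write(f">subsequence{i}\n{peptide}\n")
```

Calling write_fasta(sliding_windows(sequence), "subsequences.txt") would write to the same file name as the question. No pandas DataFrame is needed; a plain list of strings is enough.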