8.1 years ago
oshin707
MOD-EDIT: This is a lead-in from here: dotplot with python
For my assignment I was asked to align two proteins, 70 residues of one and 20 residues of the other, and show them in a '*' graph (a dot plot).
The code I used was:
FASTAa=open("P04637.fas", "r") # r means read
header=FASTAa.readline()
a=""
for ll in FASTAa:
    #print ll
    a+=ll.rstrip()
al=list(FASTAa)
FASTAb=open("Q0VCX4.fas", "r") # r means read
header=FASTAb.readline()
b=""
for ll in FASTAb:
    #print ll
    b+=ll.rstrip()
bl=list(FASTAb)
print a[0:69]
print b[0:19]
a1=list("MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEA")
b1=str("MATQADLMELDMAMEPDRK")
print ''.join(a1)
for i in b1:
    print "\n" "%s" %i,
    for x in a1 :
        if i==x in b1:
            print '*',
        else:
            print ' ' ,
This gives me a graph, BUT the rows and columns are not aligned! What am I doing wrong?
Please show what you got and what you aimed for. In addition, when parsing fasta files it is recommended to use SeqIO from Biopython.
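As a side note on the reading loop in the question: after the `for` loop has consumed the file, `list(FASTAa)` returns an empty list, and a slice like `a[0:69]` yields 69 residues rather than 70. If Biopython is not allowed for the assignment, a plain-Python reader along these lines may be enough. This is only a sketch in Python 3 syntax (the question uses Python 2); `read_first_sequence` is a made-up helper name, and the `io.StringIO` object stands in for the real file, seeded with the first 20 residues quoted in the post. `SeqIO` remains the more robust choice for real FASTA files.

```python
import io


def read_first_sequence(handle):
    """Return (header, sequence) for the first record of a FASTA handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here, so stop reading
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    return header, "".join(chunks)


# Stand-in for open("P04637.fas"); replace with a real file handle.
demo = io.StringIO(">demo protein\nMEEPQSDPSV\nEPPLSQETFS\n>second\nAAAA\n")
header, seq = read_first_sequence(demo)

# A slice [:n] yields exactly n residues, so use [:70] and [:20] for the assignment.
first_15 = seq[:15]
print(header, first_15, len(first_15))
```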
Yesterday it gave me a graph with many '*' which did not align. But today it is giving me a graph with 1 '*' and it still doesn't align.
I used the same code:
And the graph it gives is:
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEA
M A T Q A D L M E L D M A M E P D R K *
Can anyone please help me :(
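Judging only from the code posted, a likely cause of the misalignment is the column width. In Python 2, `print '*',` puts a space before each item on the same line, so every cell in a row is two characters wide and each row also starts with the residue letter. The header, printed with `''.join(a1)`, uses one character per residue and no left margin, so it cannot line up with the rows below it. Separately, `i==x in b1` is a chained comparison (`i==x and x in b1`); a plain `i==x` is enough. Below is a minimal sketch of one way to keep everything aligned, written in Python 3 syntax; `dotplot` is a made-up function name, and the two sequences are the ones hard-coded in the question.

```python
def dotplot(seq_a, seq_b):
    """Return a text dot plot with seq_a across the top and seq_b down the side."""
    # Two-character left margin so the header sits above the cells, not the row labels.
    lines = ["  " + seq_a]
    for res_b in seq_b:
        # One character per cell: '*' for identical residues, ' ' otherwise.
        cells = "".join("*" if res_b == res_a else " " for res_a in seq_a)
        lines.append(res_b + " " + cells)
    return "\n".join(lines)


a1 = "MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEA"
b1 = "MATQADLMELDMAMEPDRK"
print(dotplot(a1, b1))
```

Building each row as a single string and printing it once avoids depending on how `print` separates items, which is where the extra spaces came from.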
Please use the 101010 button to properly format code to make it readable.
Also, it still isn't clear what you want and what you get. You write some letters, either continuous or spaced and with a star. How that's a graph is beyond me, for now, so please clarify. If you use the exact same code with the exact same input and get something different compared to yesterday you probably didn't use exactly the same.
You might use Biopython to obtain the alignment, if you are allowed to use existing libraries:
http://biopython.org/DIST/docs/api/Bio.pairwise2-module.html
Otherwise, this documentation shows some of the considerations that go into an alignment (e.g. you could use different gap penalties).
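To make the gap-penalty point concrete, here is a minimal sketch of a global alignment by dynamic programming (the Needleman-Wunsch approach) in plain Python 3. It is not Biopython's implementation. The function name `global_align` and the scores `match=1`, `mismatch=-1`, `gap=-2` are arbitrary assumptions for illustration; a real protein alignment would normally use a substitution matrix such as BLOSUM62 and often separate gap-open and gap-extend penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = global_align("MEEPQSDPSVEPPLSQETFS", "MATQADLMELDMAMEPDRK")
print(aligned_a)
print(aligned_b)
print(total)
```

Re-running with a harsher penalty, for example `gap=-5`, is a quick way to see how the penalty changes where gaps are placed.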