Fasta Module In Biopython-Python, Again

1


13.3 years ago

Jovana ▴ 10

Hello, I have a similar question about the Fasta module in Biopython. I am using the DisEMBL predictor (Disembl.py) for protein disorder and I get the error ImportError: cannot import name Fasta. I read the answer in the "Fasta module in Python/Biopython" thread, which says to replace Fasta with SeqIO, but since I don't know Python, I could not adapt the script. The problem is in the following two lines:

parser = Fasta.RecordParser()
iterator = Fasta.Iterator(db,parser)

Which functions should I use instead?

And later, in the line:

cur_record = iterator.next()

What should I use instead of this? I tried to do the same as suggested in "Fasta module in Python/Biopython" (

    def runGlobPlot():
        try:
            smoothFrame = int(sys.argv[1])
            DOM_joinFrame = int(sys.argv[2])
            DOM_peakFrame = int(sys.argv[3])
            DIS_joinFrame = int(sys.argv[4])
            DIS_peakFrame = int(sys.argv[5])
            file = str(sys.argv[6])
            db = open(file,'r')
        except:
            print 'Usage:'
            print ' ./GlobPipe.py SmoothFrame DOMjoinFrame DOMpeakFrame DISjoinFrame DISpeakFrame FASTAfile'
            print ' Optimised for ELM: ./GlobPlot.py 10 8 75 8 8 sequence_file'
            print ' Webserver settings: ./GlobPlot.py 10 15 74 4 5 sequence_file'
            raise SystemExit
        for cur_record in SeqIO.parse(db, "fasta"):
            #uppercase is searchspace
            seq = upper(str(cur_record.seq))
            # sum function
            sum_vector = Sum(seq,RL)
            # Run Savitzky-Golay
            smooth = SavitzkyGolay(`smoothFrame`,0, sum_vector)
            dydx_vector = SavitzkyGolay(`smoothFrame`,1, sum_vector)
            #test
            sumHEAD = sum_vector[:smoothFrame]
            sumTAIL = sum_vector[len(sum_vector)-smoothFrame:]
            newHEAD = []
            newTAIL = []
            for i in range(len(sumHEAD)):
                try:
                    dHEAD = (sumHEAD[i+1]-sumHEAD[i])/2
                except:
                    dHEAD = (sumHEAD[i]-sumHEAD[i-1])/2
                try:
                    dTAIL = (sumTAIL[i+1]-sumTAIL[i])/2
                except:
                    dTAIL = (sumTAIL[i]-sumTAIL[i-1])/2
                newHEAD.append(dHEAD)
                newTAIL.append(dTAIL)
            dydx_vector[:smoothFrame] = newHEAD
            dydx_vector[len(dydx_vector)-smoothFrame:] = newTAIL
            globdoms, globdis = getSlices(dydx_vector, DOM_joinFrame, DOM_peakFrame, DIS_joinFrame, DIS_peakFrame)
            s_domMask, coordstrDOM = reportSlicesTXT(globdoms, seq, 'DOM')
            s_final, coordstrDIS = reportSlicesTXT(globdis, s_domMask, 'DIS')
            sys.stdout.write('>'+cur_record.id+coordstrDOM+coordstrDIS+'\n')
            print s_final
            print '\n'
        return


) but it works only once.

biopython fasta • 3.9k views

ADD COMMENT • link updated 13.2 years ago by Brad Chapman 9.7k • written 13.3 years ago by Jovana ▴ 10

6


13.2 years ago

Brad Chapman 9.7k

Here is an updated version of DisEMBL.py using SeqIO:

https://gist.github.com/1675927

The original version is available from:

http://dis.embl.de/
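
For readers who cannot open the gist, the general shape of the change can be sketched as follows. This is not the code from the gist, only a minimal illustration of swapping the removed Fasta parser for SeqIO. It assumes Biopython is installed and is written for Python 3 (the script in the question is Python 2). The function name iter_sequences and the two-record input are made up for the example.

```python
from io import StringIO

from Bio import SeqIO  # replaces the removed Bio.Fasta module


def iter_sequences(handle):
    """Yield (record id, upper-cased sequence) for each FASTA record."""
    # Old code:
    #   parser = Fasta.RecordParser()
    #   iterator = Fasta.Iterator(handle, parser)
    #   cur_record = iterator.next()
    # SeqIO.parse returns an iterator of SeqRecord objects, so a plain
    # for loop replaces the parser, the iterator, and the .next() calls.
    for record in SeqIO.parse(handle, "fasta"):
        yield record.id, str(record.seq).upper()


# Made-up two-record FASTA input, used only for illustration.
example = StringIO(">seq1 first\nmkvl\n>seq2 second\nACDE\n")
results = list(iter_sequences(example))
for name, seq in results:
    print(name, seq)
```

The for loop stops by itself after the last record, so no explicit end-of-file check is needed. If a single record is wanted rather than a loop, next(SeqIO.parse(handle, "fasta")) plays the role of the old iterator.next() call.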

ADD COMMENT • link 13.2 years ago by Brad Chapman 9.7k

1


Nice, just what I was looking for, thanks and +1. Thought I'd point out a typo in the gist I just downloaded: line 152, sys.stdout.write('> '+cur_record.id'_COILS '), should be sys.stdout.write('> '+cur_record.id+'_COILS ') (notice the + before '_COILS '); otherwise you get SyntaxError: invalid syntax.

ADD REPLY • link 12.3 years ago by terdon ▴ 430
