13.4 years ago
Ankur
Hi, I have the following code
def runBLAST(self):
    print "Running BLAST .........."
    cmd=subprocess.Popen("blastp -db nr -query repeat.txt -out out.faa -evalue 0.001 -gapopen 11 -gapextend 1 -matrix BLOSUM62 -remote -outfmt 5",shell=True)
    cmd.communicate()[0]
    f1=open("out.faa")
    blast_records = NCBIXML.parse(f1)
    save_file = open("my_fasta_seq.fasta", 'w')
    for blast_record in blast_records[:10]:
        for alignment in blast_record.alignments:
            for hsp in alignment.hsps:
                save_file.write('>%s\n' % (alignment.hseq,))
    save_file.close()
    f1.close()
    f2=open("my_fasta_seq.fasta")
    for record in SeqIO.parse(f2,"fasta"):
        f=open("tempBLAST1.txt","w")
        f.write(">"+"\n"+str(record.name)+"\n"+str(record.seq)+"\n")
        f.close()
I get a TypeError on the line "for blast_record in blast_records[:10]:" saying 'generator' object is not subscriptable. I am looking to get the top 10 BLAST hits (sequences).
It's also a follow-up to the previous question and perhaps should have continued there instead. It's fine to edit your questions and discuss answers in the comments, rather than starting a new question for every variation of the same problem.
As Michael says, blast_records is a generator/iterator. You can loop over it or iterate explicitly by calling next(), but you cannot access records by index or slice. This is a general design pattern for coping with very large files composed of many smaller records; the Biopython SeqIO.parse function works the same way.
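To illustrate the point about generators, here is a minimal sketch of taking the first 10 items without indexing, using itertools.islice from the standard library. The fake_records function is a hypothetical stand-in for NCBIXML.parse(f1), used only so the snippet runs without Biopython or a BLAST output file; the same islice call should work on the real blast_records generator.

```python
from itertools import islice


def fake_records(n):
    """Hypothetical stand-in for NCBIXML.parse(handle): yields records lazily."""
    for i in range(n):
        yield "record_%d" % i


records = fake_records(25)

# records[:10] would raise TypeError ('generator' object is not subscriptable).
# islice pulls at most the first 10 items from the generator instead.
top_ten = list(islice(records, 10))

# The generator remembers its position, so next() continues with the 11th item.
eleventh = next(records)

print(top_ten)
print(eleventh)
```

In the original code this would read "for blast_record in islice(blast_records, 10):". Note also that NCBIXML.parse yields one record per query sequence, so if the goal is the top 10 hits for a single query, slicing the list blast_record.alignments[:10] inside the loop may be closer to what is wanted.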