I ran a BLAST search with Biopython on a FASTA file with 92 entries (goat.fasta):
from Bio.Blast import NCBIWWW
from Bio.Blast import NCBIXML
import pandas as pd
# Read all query sequences and close the file afterwards
with open("goat.fasta") as fasta_file:
    fasta_string = fasta_file.read()
result_handle = NCBIWWW.qblast("blastx", sequence=fasta_string, database="refseq_protein", entrez_query="txid9606[ORGN]")
blast_records = NCBIXML.parse(result_handle)
blast_record_list = list(blast_records)
The variable blast_record_list holds a list of BLAST record objects, and I would like to pull out the title of each alignment from them. Here are my two attempts, along with the errors I got:
for entry in blast_record_list:
    for alignment in entry:
        for hsp in alignment.hsps:
            print("sequence:", alignment.title)

Traceback (most recent call last):
  File "<ipython-input-42-bb44df962013>", line 2, in <module>
    for alignment in entry:
TypeError: 'Blast' object is not iterable
And
for alignment in blast_record_list[0:len(blast_record_list)].alignments:
    for hsp in alignment.hsps:
        print("****Alignment****")
        print("sequence:", alignment.title)

Traceback (most recent call last):
  File "<ipython-input-41-033f221fbc66>", line 1, in <module>
    for alignment in blast_record_list[0:len(blast_record_list)].alignments:
AttributeError: 'list' object has no attribute 'alignments'
Can someone tell me what I am doing wrong here?
You may want to consider using Biopython's SearchIO module instead of Pandas for this, since it already provides a number of methods and data structures for handling BLAST data.
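
As for the two errors themselves: each item that NCBIXML.parse yields is a single record object, and it cannot be looped over directly. Its hits are stored in the record's alignments attribute. In the second attempt, slicing a list returns another list, and a list has no alignments attribute, so the loop has to visit each record first. Below is a minimal sketch of that nested loop. The helper name alignment_titles and the demo_records stand-ins are hypothetical, used only so the example runs without a network call; with real data you would pass blast_record_list instead.

```python
from types import SimpleNamespace


def alignment_titles(records):
    """Collect the title of every alignment in every BLAST record."""
    titles = []
    for record in records:
        # The record itself is not iterable; its hits are in .alignments
        for alignment in record.alignments:
            titles.append(alignment.title)
    return titles


# Hypothetical stand-ins that mimic only the two attributes used above
# (.alignments on a record, .title on an alignment).
demo_records = [
    SimpleNamespace(alignments=[SimpleNamespace(title="hit A"),
                                SimpleNamespace(title="hit B")]),
    SimpleNamespace(alignments=[]),  # a query with no hits
    SimpleNamespace(alignments=[SimpleNamespace(title="hit C")]),
]

for title in alignment_titles(demo_records):
    print("sequence:", title)
```

Note that the inner "for hsp in alignment.hsps" loop from the question is left out here: the title belongs to the alignment, so looping over the HSPs would only print the same title once per HSP. If you do go the SearchIO route, SearchIO.parse(handle, "blast-xml") yields one QueryResult per query, and iterating over a QueryResult gives its hits.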