Hi everyone! I am very new to both Python and Biopython and I am doing some protein sequencing work right now. Unfortunately, I have been having some trouble running BLAST searches with Biopython. I have been running the following code (Note: the third, super long parameter I'm using for qblast is the FASTA for prolactin that I found on NCBI.
import Bio
from Bio.Blast import NCBIWWW
from Bio.Blast import NCBIXML
result_handle = NCBIWWW.qblast("blastp","nt","""MNIKGSPWKGSLLLLLVSNLLLCQSVAPLPICPGGAARCQVTLRDLFDRAVVLSHYIHNLSSEMFSEFDK
RYTHGRGFITKAINSCHTSSLATPEDKEQAQQMNQKDFLSLIVSILRSWNEPLYHLVTEVRGMQEAPEAI
LSKAVEIEEQTKRLLEGMELIVSQVHPETKENEIYPVWSGLPSLQMADEESRLSAYYNLLHCLRRDSHKI
DNYLKLLKCRIIHNNNC*""")
with open ("my_blast.xml" , "w") as out_handle :
out_handle.write(result_handle.read())
result_handle.close
result_handle = open("my_blast.xml")
blast_record = NCBIXML.read(result_handle)
However, the XML file produced is the following:
http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd">
<BlastOutput>
<BlastOutput_program>blastp</BlastOutput_program>
<BlastOutput_version>BLASTP 2.8.0+</BlastOutput_version>
<BlastOutput_reference>Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.</BlastOutput_reference>
<BlastOutput_db>nt</BlastOutput_db>
<BlastOutput_query-ID>Query_201655</BlastOutput_query-ID>
<BlastOutput_query-def>unnamed protein product</BlastOutput_query-def>
<BlastOutput_query-len>228</BlastOutput_query-len>
<BlastOutput_param>
<Parameters>
<Parameters_matrix>BLOSUM62</Parameters_matrix>
<Parameters_expect>10</Parameters_expect>
<Parameters_gap-open>11</Parameters_gap-open>
<Parameters_gap-extend>1</Parameters_gap-extend>
<Parameters_filter>F</Parameters_filter>
</Parameters>
</BlastOutput_param>
<BlastOutput_iterations>
<Iteration>
<Iteration_iter-num>1</Iteration_iter-num>
<Iteration_query-ID>Query_201655</Iteration_query-ID>
<Iteration_query-def>unnamed protein product</Iteration_query-def>
<Iteration_query-len>228</Iteration_query-len>
<Iteration_hits>
</Iteration_hits>
<Iteration_stat>
<Statistics>
<Statistics_db-num>0</Statistics_db-num>
<Statistics_db-len>0</Statistics_db-len>
<Statistics_hsp-len>0</Statistics_hsp-len>
<Statistics_eff-space>0</Statistics_eff-space>
<Statistics_kappa>-1</Statistics_kappa>
<Statistics_lambda>-1</Statistics_lambda>
<Statistics_entropy>-1</Statistics_entropy>
</Statistics>
</Iteration_stat>
</Iteration>
</BlastOutput_iterations>
</BlastOutput>
I'm not sure what exactly I'm doing wrong. Sorry if this is a dumb mistake, but please let me know what I can do to fix this problem!