Is there a reason for the E-value to differ between BLAST on the NCBI website and BLAST run through Biopython? My understanding is that both query the same service, so I cannot explain the following differences.
As an example, consider the sequence
tpdavmgnpk
Submit this to blastp on the web and compare the result with the output of the following Python script:
from Bio.Blast import NCBIWWW
from Bio.Blast import NCBIXML

peptide = "tpdavmgnpk"
myEntrezQuery = "Homo sapiens[Organism]"

# Submit the peptide to NCBI blastp against nr, restricted to human entries
result = NCBIWWW.qblast("blastp", "nr", peptide, entrez_query=myEntrezQuery)
records = NCBIXML.parse(result)
blast_record = next(records)

# Report every HSP with an E-value below 5
for alignment in blast_record.alignments:
    for hsp in alignment.hsps:
        if hsp.expect < 5:
            print("***** RECORD ****")
            print("sequence:", alignment.title)
            print("E-value:", hsp.expect)
Here are two examples of the differing E-values I obtain:
Accession, Biopython E-value, NCBI web E-value
AAW66689.1, 1.20033, 0.045
AAA53153.1, 1.21977, 0.075
EDIT: I have tried making the parameters match the web defaults (following Peter's answer, Ben's comment, and this link):
result = NCBIWWW.qblast("blastp", "nr", peptide,
                        entrez_query=myEntrezQuery,
                        matrix_name='BLOSUM62',
                        word_size='2',
                        expect='50000',
                        gapcosts='11 1',
                        composition_based_statistics='no adjustment')
The results still do not match.
Thanks!
The NCBI web interface adjusts BLAST parameters for short sequences - does BioPython?
Thanks for this insight. I have edited my question to address this.
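To make the short-sequence adjustment mentioned above concrete, here is a minimal sketch of a helper that returns keyword overrides for qblast when the query is short. The helper name short_query_overrides, the 30-residue cutoff, and every parameter value in it are assumptions based on commonly cited blastp-short settings, not values confirmed in this thread. Check them against the search summary shown on the web results page before relying on them.

```python
def short_query_overrides(sequence, max_len=30):
    """Return qblast keyword overrides for a short protein query.

    All values below are ASSUMED blastp-short style settings; verify them
    against the parameters reported by the web interface for your search.
    """
    if len(sequence) >= max_len:
        # Long enough: leave the qblast defaults untouched
        return {}
    return {
        "matrix_name": "PAM30",             # assumed short-query matrix
        "word_size": 2,                     # assumed short-query word size
        "expect": 200000,                   # assumed relaxed E-value cutoff
        "gapcosts": "9 1",                  # assumed gap open/extend for PAM30
        "composition_based_statistics": 0,  # assumed: 0 means no adjustment
    }


overrides = short_query_overrides("tpdavmgnpk")
print(overrides)

# Hypothetical use with the script from the question (needs network access):
# result = NCBIWWW.qblast("blastp", "nr", peptide,
#                         entrez_query=myEntrezQuery, **overrides)
```

Keeping the overrides in a dictionary makes it easy to print exactly what was sent and compare it line by line with the parameters the web page reports.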
I found this from a few years ago that may be helpful (with some adjustment)
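One way to narrow down where the two E-values diverge is to compare the bit scores as well. For a bit score S', BLAST reports E = N * 2^(-S'), where N is the effective search space. The sketch below back-computes N from a reported bit score and E-value, so you can tell whether the scoring differs (different matrix or gap costs) or only the search space does (for example, because of how the Entrez restriction is applied). The function names are hypothetical, and the bit score of 30.0 in the usage lines is a placeholder, since the thread does not report bit scores. In Biopython the real values are available as hsp.bits and hsp.expect.

```python
def implied_search_space(bit_score, e_value):
    """Effective search space N implied by E = N * 2**(-bit_score)."""
    return e_value * 2.0 ** bit_score


def compare_hits(bits_a, e_a, bits_b, e_b):
    """Summarise how two reports of the same hit differ."""
    space_a = implied_search_space(bits_a, e_a)
    space_b = implied_search_space(bits_b, e_b)
    return {
        "bit_score_diff": bits_a - bits_b,
        "search_space_ratio": space_a / space_b,
    }


# E-values for AAW66689.1 from the question; the bit score is a placeholder
summary = compare_hits(30.0, 1.20033, 30.0, 0.045)
print(summary)
```

If the bit scores agree and only the search-space ratio differs from 1, the scoring parameters already match and the remaining difference comes from the database size used in the statistics.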