This question concerns differences between the output of blastp-short using BLAST+ vs. the web interface.
When searching against RefSeq or nr (updated yesterday), BLAST+ misses some of the hits returned by the web interface, and sometimes assigns different bit scores to the same hits. Below is an example. The BLAST+ command was:
blastp \
-task blastp-short \
-db refseq_protein \
-query "$f" \
-out "$g" \
-evalue 1000000 \
-max_target_seqs 500 \
-outfmt "6 qseqid sgi sacc qstart qend sstart send evalue bitscore pident staxids"
The web interface was run with its default settings (short-query parameters applied automatically, etc.).
The top three hits returned by BLAST+ for the query sequence GTADESVGAAR are:
RANK  ACCESSION       E VALUE  BIT SCORE
1.    WP_015164269    107      27.8
2.    WP_010519043    132      27.8
3.    XP_007813862    266      26.5
Some hits returned by the blastp (short) web interface for the same query are:
RANK  ACCESSION       E VALUE  BIT SCORE
1.    WP_003802252.1  82       28.2       # NOT REPORTED BY BLAST+
2.    WP_015164269.1  115      27.8       # THE TOP BLAST+ HIT
34.   XP_007813862.1  392      26.1       # THE THIRD BLAST+ HIT; BIT SCORES DIFFER
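For anyone who wants to reproduce the comparison, here is a minimal Python sketch of how the two hit lists can be checked against each other. The helper names (`strip_version`, `compare_hits`) and the 0.05 bit-score tolerance are my own illustrative choices, not part of BLAST+. It assumes the web accessions carry a version suffix (e.g. `.1`) while the BLAST+ `sacc` column does not, as in the tables above. The scores are copied from those tables.

```python
def strip_version(accession):
    """Drop a trailing '.N' version so 'WP_015164269.1' matches 'WP_015164269'."""
    return accession.split(".", 1)[0]


def compare_hits(local_hits, web_hits, tol=0.05):
    """Compare two {accession: bit_score} mappings.

    Returns a sorted list of accessions present only in web_hits, and a dict
    {accession: (local_score, web_score)} for shared hits whose bit scores
    differ by more than tol (tol is an arbitrary, illustrative threshold).
    """
    local = {strip_version(acc): score for acc, score in local_hits.items()}
    web = {strip_version(acc): score for acc, score in web_hits.items()}

    web_only = sorted(acc for acc in web if acc not in local)
    mismatched = {
        acc: (local[acc], web[acc])
        for acc in web
        if acc in local and abs(local[acc] - web[acc]) > tol
    }
    return web_only, mismatched


# Bit scores copied from the two tables in this post.
blast_plus = {"WP_015164269": 27.8, "WP_010519043": 27.8, "XP_007813862": 26.5}
web = {"WP_003802252.1": 28.2, "WP_015164269.1": 27.8, "XP_007813862.1": 26.1}

web_only, mismatched = compare_hits(blast_plus, web)
print("Reported by the web interface only:", web_only)
print("Bit scores differ (BLAST+, web):", mismatched)
```

With the `-outfmt` string used above, `sacc` is the third tab-separated column and `bitscore` the ninth, so the BLAST+ dictionary could be filled from the output file instead of being typed in by hand.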
It seems the blastp-short settings in BLAST+ and the web interface must differ, since the two return different hits and often assign different bit scores to the same hit. Does anyone know how the settings differ and how they can be reconciled? I'd like to recover the top-scoring hits that the web interface reports but BLAST+ misses.
Thanks!