I want to BLAST one protein sequence (Q6GZX4.fasta) against all the sequences in the file part0.fasta (FASTA format), which contains 5000 sequences. First I tried using part0.fasta directly as the subject. Then I tried a formatted database built from it with:
makeblastdb -in part0.fasta -title part0 -dbtype prot -out part0 -parse_seqids
Using -query Q6GZX4.fasta and -subject part0.fasta (case A; the output is the line count):
user% blastp -query Q6GZX4.fasta -subject part0.fasta -evalue 100 -max_target_seqs 5000 -max_hsps 1 -outfmt 6|wc -l
4572
Using -query Q6GZX4.fasta and -db part0 (case B; the output is the line count):
user% blastp -query Q6GZX4.fasta -db part0 -evalue 100 -max_target_seqs 5000 -max_hsps 1 -outfmt 6|wc -l
43
Why do I get different results: 4572 hits in case A, but only 43 in case B?
What happens when you run blastdbcmd -info -db part0?
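A quick way to do that check is to compare the number of records in the FASTA file with what the database reports. This is only a minimal sketch: the helper name count_fasta_records is made up for illustration, the file and database names are taken from the question, and it assumes every record header starts with ">" at the beginning of a line.

```shell
# Hypothetical helper: count FASTA records by counting header lines,
# i.e. lines that start with ">".
count_fasta_records() {
    grep -c '^>' "$1"
}

fasta=part0.fasta   # FASTA file name from the question
db=part0            # database name passed to makeblastdb -out

# Report the record count only if the FASTA file is present.
if [ -f "$fasta" ]; then
    echo "FASTA records: $(count_fasta_records "$fasta")"
fi

# blastdbcmd ships with BLAST+; skip this step if it is not on PATH.
if command -v blastdbcmd >/dev/null 2>&1; then
    blastdbcmd -db "$db" -info || echo "could not read database $db" >&2
fi
```

If the sequence count printed by blastdbcmd matches the FASTA record count, the database itself is probably complete, and the difference is more likely to come from how the two search modes are scored than from missing sequences.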