Hi,
I've been using CLCBio to blast assembled contigs, but it's really slow. I decided to try setting up a local blast database and using that to blast my contigs, but I'm getting different results even though I'm using the same parameters. The parameters are below:
CLCBio:
Query genetic code: 1 Standard
Limit by entrez query: All organisms
Filter low complexity
Expect: 10
Word Size: 3
Matrix: BLOSUM62
Gap cost: Existence 11, Extension 1
Max number of hit sequences: 3
Local Blast:
blastx -db nr \
-query ../results/contigs/CLC-contigs.fa \
-evalue 10 \
-matrix 'BLOSUM62' \
-word_size 3 \
-gapopen 11 \
-gapextend 1 \
-max_target_seqs 3 \
-outfmt "10 std stitle" \
-out ../results/blast/blast-005.csv \
-num_threads 4
Since CLCBio and blast+ are using the same parameters, the same query sequence and the same database, I should get the same results, right? But I'm getting 226 hits in CLCBio that aren't in the local blast. Of these species, 2 are extremely important and are known to be in the query sequence.
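For reference, here is a rough sketch of how the two hit lists could be compared to pin down exactly which subjects are missing. It assumes the CLCBio hits have been exported to a plain text file with one subject ID per line (the file name clc_ids.txt is hypothetical), and that those IDs are written in the same style as the sseqid column of the local CSV. The function names are made up for illustration.

```shell
# Pull the subject ID (column 2 of the "std" fields) out of a local
# blast CSV written with -outfmt "10 std stitle".
local_subject_ids() {
    cut -d, -f2 "$1"
}

# Print IDs that appear in the first file but not in the second.
# Both files are assumed to hold one subject ID per line, in the same ID style.
missing_hits() {
    mh_a=$(mktemp) || return 1
    mh_b=$(mktemp) || { rm -f "$mh_a"; return 1; }
    LC_ALL=C sort -u "$1" > "$mh_a"
    LC_ALL=C sort -u "$2" > "$mh_b"
    # comm -23 keeps only the lines unique to the first sorted file
    LC_ALL=C comm -23 "$mh_a" "$mh_b"
    rm -f "$mh_a" "$mh_b"
}

# Example usage (clc_ids.txt is a hypothetical export from CLCBio):
#   local_subject_ids ../results/blast/blast-005.csv > local_ids.txt
#   missing_hits clc_ids.txt local_ids.txt
```

If the two tools write IDs differently (for example with or without a version suffix), the IDs would need to be normalised first, otherwise every hit will look like it is missing.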
Any ideas?
Thanks!
Check that your database is indeed the same; it could be a different version.
How can I check that? Is there a list of when the databases are updated? The CLCBio blast was done remotely a few days before the local blast so this could be the problem.
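For the local side, one possible check is the blastdbcmd tool that ships with BLAST+. Its -info option prints a summary of the database, which should include a date for when that copy was built. This is only a sketch: the wrapper function db_info is made up for illustration, and it assumes the local database is reachable under the name nr, as in the blastx command above. It says nothing about which version the remote search ran against.

```shell
# Print summary information (title, sequence count, date) for a local
# BLAST database. Requires the BLAST+ tools to be on PATH.
db_info() {
    db_name=$1
    if ! command -v blastdbcmd >/dev/null 2>&1; then
        echo "blastdbcmd not found; is BLAST+ installed?" >&2
        return 1
    fi
    blastdbcmd -db "$db_name" -info
}

# Example usage:
#   db_info nr
```

If the date reported for the local copy differs from the day the remote CLCBio search was run, the two searches probably did not see identical databases.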