I want to BLAST some ESTs with a local BLAST installation. However, for some sequences the local search finds no matches, while the website http://www.arabidopsis.org/Blast/index.jsp does find matches. (I downloaded the database from the same site, so the database I use locally is the same as the website's.) I don't know what the problem is. How can I fix it?
I used BLAST 2.2.25+ and built the database with this command:
makeblastdb -in TAIR10_cdna.fast -out TAIR10_cdna -dbtype nucl -input_type fasta
Next I ran the search:
blastn -query buff.fa -db TAIR10_cdna -out cx274252 -dust yes -max_target_seqs 250 -penalty -3 -outfmt 4 -gapopen 5 -gapextend 2
The output looks like this:
BLASTN 2.2.25+
Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb
Miller (2000), "A greedy algorithm for aligning DNA sequences", J
Comput Biol 2000; 7(1-2):203-14.
Database: TAIR10_cdna.fast
41,671 sequences; 64,867,051 total letters
Query= CX274252
Length=662
***** No hits found *****
Lambda K H
1.37 0.711 1.31
Gapped
Lambda K H
1.37 0.711 1.31
Effective search space used: 41291330612
Database: TAIR10_cdna.fast
Posted date: May 9, 2011 11:18 PM
Number of letters in database: 64,867,051
Number of sequences in database: 41,671
Matrix: blastn matrix 1 -3
Gap Penalties: Existence: 5, Extension: 2
The result from the website was:
BLASTN 2.2.17 [Aug-26-2007]
Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= CX274252
(662 letters)
Database: TAIR10 Transcripts (-introns, +UTRs) (DNA)
41,671 sequences; 64,867,051 total letters
Searching..................................................done
Score E
Sequences producing significant alignments: (bits) Value
AT4G27160.1 | Symbols: AT2S3, SESA3 | seed storage albumin ... 64 2e-09
AT4G27140.1 | Symbols: SESA1, AT2S1 | seed storage albumin ... 48 1e-04
AT4G27150.1 | Symbols: SESA2, AT2S2 | seed storage albumin ... 44 0.002
AT1G14170.3 | Symbols: | RNA-binding KH domain-containing ... 44 0.002
AT1G14170.2 | Symbols: | RNA-binding KH domain-containing ... 44 0.002
AT1G14170.1 | Symbols: | RNA-binding KH domain-containing ... 44 0.002
AT4G27170.1 | Symbols: SESA4, AT2S4 | seed storage albumin ... 42 0.009
AT4G00895.1 | Symbols: | ATPase, F1 complex, OSCP/delta su... 36 0.53
............. (this is a very long list, so I have omitted the rest)
Database: TAIR10 Transcripts (-introns, +UTRs) (DNA)
Posted date: Jan 13, 2011 1:41 PM
Number of letters in database: 64,867,051
Number of sequences in database: 41,671
Lambda K H
1.37 0.711 1.31
Gapped
Lambda K H
1.37 0.711 1.31
Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
Number of Sequences: 41671
Number of Hits to DB: 342,107
Number of extensions: 18701
Number of successful extensions: 1355
Number of sequences better than 10.0: 41
Number of HSP's gapped: 1354
Number of HSP's successfully gapped: 53
Length of query: 662
Length of database: 64,867,051
Length adjustment: 18
Effective length of query: 644
Effective length of database: 64,116,973
Effective search space: 41291330612
Effective search space used: 41291330612
X1: 11 (21.8 bits)
X2: 15 (29.7 bits)
X3: 25 (49.6 bits)
S1: 13 (26.3 bits)
S2: 16 (32.2 bits)
I set the parameters mostly the same as the website settings, except for the weighted matrix and max_score, which I don't know how to set.
Hi Zhizhong and welcome to Biostars. You should probably give more details about what you tried. What kind of BLAST searches did you try? What are the website and local database numbers? What options did you select on the site, and what command did you use on your computer? What version of BLAST did you use on your machine? Etc. You may even post one result for the same sequence in both cases (server vs. local). Cheers
Thanks for the reminder. I edited the question and posted the output for the same sequence from both searches.
Most likely answer, assuming you set up local BLAST correctly, is that local and web BLAST used slightly different parameters. As Eric says, we need more details to answer the question.
Hi Zhizhong. If you have found your solution, you can take the time to write a clear answer to your own question and mark it as solved. There is nothing against that. Just make sure that both the question and the answer are well formatted. This way, it has a better chance of being useful to others. Cheers!
Are you using the parameters e-value (0.01), filter, and composition statistics as in the database version? One or more of these parameters can affect your results.
I have now fixed it by adding -task blastn and -reward 1. I would like to delete this question if it is of no use to others.
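For anyone hitting the same problem, here is a sketch of the corrected command. In BLAST+, blastn runs with -task megablast by default, which uses a word size of 28, while -task blastn uses a word size of 11, like the legacy blastn that the website appears to run (BLASTN 2.2.17). As far as I know, -task blastn also changes the default reward to 2, which is why -reward 1 is needed to keep the 1/-3 scoring shown in both reports. All other options and file names are copied from the original command; the BLAST_OPTS variable and the guard around the run are only there for illustration.

```shell
# Options from the original command, plus the two changes that fixed it
# (-task blastn and -reward 1). BLAST_OPTS is just an illustrative name.
BLAST_OPTS="-task blastn -reward 1 -penalty -3 -gapopen 5 -gapextend 2 -dust yes -max_target_seqs 250 -outfmt 4"

# Print the full command so it can be checked before running.
echo "blastn -query buff.fa -db TAIR10_cdna -out cx274252 $BLAST_OPTS"

# Run only if BLAST+ is installed and the query file is present.
# $BLAST_OPTS is left unquoted on purpose so it splits into separate options.
if command -v blastn >/dev/null 2>&1 && [ -f buff.fa ]; then
    blastn -query buff.fa -db TAIR10_cdna -out cx274252 $BLAST_OPTS
fi
```

If some hits are still missing, the footer of each report (matrix, gap penalties, effective search space) is a quick way to see which settings still differ between the local run and the website.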