Different results between remote and online blast
1
0
4.8 years ago

I perform remote blast with this command line:

blastn -task megablast -num_threads 6 -outfmt "6 std ssciname" -max_target_seqs 10 -db nt_v5 -query input_file -taxidlist Plants_DB -out blastfile

So my questions are: 1) Is there a way to perform a discontiguous blast remotely? 2) When I compare these results to an online blast, they are different. Can anyone tell me how to get the same results? Thanks

blastn sequencing alignment • 2.0k views
2
4.8 years ago

I assume megablast is different from running a discontiguous blast (dc-megablast). This is taken from the output of blastn -help:

-task <String, Permissible values: 'blastn' 'blastn-short' 'dc-megablast'
            'megablast' 'rmblastn' >

Another source of variation could be the database used. The local database will not necessarily match the online one if the newest sequences are not included. I believe they update the databases about once a month.
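One way to see how current the local database is would be to print its metadata with blastdbcmd, which ships with BLAST+. This is a minimal sketch, assuming the database name nt_v5 from the question; the run helper is a hypothetical dry-run wrapper that only prints the command unless RUN=1 is set.

```shell
# Hypothetical helper: prints the command by default, executes it when RUN=1.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

# Print the title, sequence count and build date of the local database.
# nt_v5 is the database name used in the question.
run blastdbcmd -db nt_v5 -info
```

Comparing the reported date with the date of the online search gives a rough idea of how far the two databases may have drifted apart.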

0

But the results (species names) should be the same... and in the remote blast I get fewer species than in the online one.

0

The species names can vary depending on the way you search. By running a dc-megablast, you use a smaller word size in the search than with megablast, so the hit list can fill up with less conserved sequences, which leads to a different species list.
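As a sketch of what a discontiguous search might look like from the command line, the original command can be reused with only the -task value changed to dc-megablast, one of the permissible values listed by blastn -help. The output name blastfile_dc and the run helper are hypothetical; everything else is copied from the question.

```shell
# Hypothetical helper: prints the command by default, executes it when RUN=1.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

# Same options as the command in the question, with -task switched to dc-megablast.
# blastfile_dc is an assumed output name, chosen so the megablast results are not overwritten.
run blastn -task dc-megablast -num_threads 6 -outfmt "6 std ssciname" \
    -max_target_seqs 10 -db nt_v5 -query input_file \
    -taxidlist Plants_DB -out blastfile_dc
```

Note that -max_target_seqs 10 keeps at most 10 hits per query, which may be one reason the command-line run reports fewer species than the web page if the web search is set to keep more.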

0

To get the same results locally and remotely, you should use not only the same task (megablast or dc-megablast), but also the same gap opening and extension penalties, and the same scoring rule for matching and mismatching bases.

0

Can you tell me which option I have to choose to match the same scoring rule? I have already changed the rest of the parameters...

0

If you are using the NCBI web service, open the Algorithm parameters section, where you can see all of the options: the match/mismatch scores, the word size (which differs between megablast and dc-megablast), and the gap costs, which penalize gaps either linearly or with separate values for gap existence and gap extension.

Then run blastn -help to see all the options you have, and try to make both searches identical by applying the same rules used in the web service.

You will see that the web page is much easier to use but has fewer configuration options.

Some of the relevant options for the local blastn, as listed by blastn -help:

> -word_size <Integer, >=4>    Word size for wordfinder algorithm (length of best perfect match)
> -gapopen <Integer>    Cost to open a gap
> -gapextend <Integer>    Cost to extend a gap
> -penalty <Integer, <=0>    Penalty for a nucleotide mismatch
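
To make the matching concrete, here is a hedged sketch that sets these options explicitly instead of relying on the defaults of each task. The numeric values are placeholders only and should be replaced with whatever the web form shows under Algorithm parameters; -reward (the match score) is the counterpart of -penalty. The output name blastfile_matched and the run helper are hypothetical.

```shell
# Hypothetical helper: prints the command by default, executes it when RUN=1.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

# Placeholder values: replace each one with the value shown in the web form.
WORD_SIZE=11      # word size
REWARD=2          # score for a nucleotide match
PENALTY=-3        # score for a nucleotide mismatch (must be <= 0)
GAPOPEN=5         # gap existence cost
GAPEXTEND=2       # gap extension cost
MAX_TARGETS=100   # hits kept per query; the command in the question used 10

# Remaining options are copied from the command in the question.
run blastn -task dc-megablast \
    -word_size "$WORD_SIZE" -reward "$REWARD" -penalty "$PENALTY" \
    -gapopen "$GAPOPEN" -gapextend "$GAPEXTEND" \
    -max_target_seqs "$MAX_TARGETS" \
    -num_threads 6 -outfmt "6 std ssciname" \
    -db nt_v5 -query input_file -taxidlist Plants_DB -out blastfile_matched
```

Keeping the values in variables makes it easy to record exactly which settings produced a given result file when comparing against the web output.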

0

Now I have noticed another discrepancy... when I looked at the seq IDs in the remote blast results, the taxID doesn't match the name online. For instance, for the seqid KX677575.1 the remote blast gives me "Vaccinium oxycoccos", but when I look up that ID in the NCBI Nucleotide database I get Corylus avellana. How can this happen? Could it be because some of the blast options are different?

1

I don't think so. Accession codes are (or should be) unique to each sequence. I don't understand how this can happen, nor do I have an adequate explanation for it.
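
One way to narrow this down might be to ask the local database directly what it records for that accession, using blastdbcmd. This is only a sketch: it assumes the nt_v5 database from the question, and it assumes the taxonomy name files (taxdb) are installed alongside it, since as far as I know the scientific name is looked up from those files by taxID rather than read from the sequence record. The run helper is hypothetical.

```shell
# Hypothetical helper: prints the command by default, executes it when RUN=1.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$@"; fi; }

# Print accession (%a), taxID (%T), scientific name (%S) and title (%t)
# for the accession mentioned above, as stored in the local database.
run blastdbcmd -db nt_v5 -entry KX677575.1 -outfmt "%a %T %S %t"
```

If the taxID printed here agrees with the online record but the name does not, the local taxonomy files would be the first thing to check; if the taxID itself differs, the local database and the online one disagree about that entry.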
