I perform remote blast with this command line:
blastn -task megablast -num_threads 6 -outfmt "6 std ssciname" -max_target_seqs 10 -db nt_v5 -query input_file -taxidlist Plants_DB -out blastfile
So my questions are: 1) Is there a way to perform a discontiguous megablast remotely? 2) When I take the same query and compare the results to an online BLAST, the results are different. Can anyone tell me how to get the same result? Thanks
But the results (species names) should be the same... and in the remote blast I get fewer species than online.
The species names can vary depending on the way you search. When you run dc-megablast, you use a smaller word size than with megablast (by default 11 instead of 28), so the hit list also picks up less conserved sequences, which leads to a different species list.
To get the same results locally and remotely, you should use not only the same task (megablast or dc-megablast), but also the same gap opening and extension penalties and the same match/mismatch scores.
Can you tell me which option I have to choose to match the same score rule? I have already changed the rest of the parameters...
If you are using the NCBI web service, go to the Algorithm parameters tab, where you can see all of the options: the match/mismatch scores, the word size (which differentiates megablast from dc-megablast), and the gap costs, which penalize gaps either linearly or with separate values for gap existence and gap extension.
Then, run
blastn -help
to see all the options you have, and simply try to make both searches identical by using the same rules applied in the web service. But you will see that the web page is far easier to use and has fewer configuration options.
An example of a command you can use with the local blastn:
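A minimal sketch of such a command, assuming the web search was run as discontiguous megablast with word size 11, match/mismatch scores 2/-3 and gap costs 5 (existence) and 2 (extension). These scoring values are assumptions, not values taken from this thread, so replace them with whatever your Algorithm parameters tab shows. The database, query, taxid list and output names are the ones from the command in the question.

```shell
# Sketch only: the scoring values are assumptions; copy the real ones
# from the Algorithm parameters tab of the web search you want to reproduce.
SCORING="-task dc-megablast -word_size 11 -reward 2 -penalty -3 -gapopen 5 -gapextend 2"

# Run only when blastn and the query file are actually available.
if command -v blastn >/dev/null 2>&1 && [ -f input_file ]; then
    # $SCORING is left unquoted on purpose so it splits into separate options.
    blastn $SCORING \
        -num_threads 6 \
        -outfmt "6 std ssciname" \
        -max_target_seqs 10 \
        -db nt_v5 \
        -query input_file \
        -taxidlist Plants_DB \
        -out blastfile
else
    echo "blastn or input_file not found, nothing was run" >&2
fi
```

Here -reward and -penalty are the match and mismatch scores, and -gapopen and -gapextend are the gap existence and extension costs. Setting -task dc-megablast is also what switches the local search to discontiguous megablast.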
Now I have noticed another difference... when I looked at the sequence IDs in the remote blast results, the species name doesn't match the one shown online. For instance, for the seqid KX677575.1 the remote blast gives "Vaccinium oxycoccos", but when I look up that ID in pubmed >> nucleotide I get Corylus avellana. How can this happen? Could it be because some of the blast options are different?
I don't think so. Accession codes are (or should be) unique to each sequence. I don't understand how this could happen, and I don't have an adequate explanation for it.
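One way to narrow this down might be to ask the local database what it has stored for that accession and compare it with the online record. A minimal sketch, assuming blastdbcmd (part of BLAST+) is installed and that nt_v5 is the same database used in the search above:

```shell
# Accession from the post above; the database name is the one from the question.
ACC="KX677575.1"
DB="nt_v5"

if command -v blastdbcmd >/dev/null 2>&1; then
    # %a = accession, %T = taxid, %S = scientific name
    blastdbcmd -db "$DB" -entry "$ACC" -outfmt "%a %T %S" \
        || echo "lookup failed for $ACC in $DB" >&2
else
    echo "blastdbcmd not found on PATH" >&2
fi
```

If the name printed here already differs from the one on the NCBI page, the mismatch would seem to come from the local database or its taxonomy files rather than from the search options.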