I hope you are doing well. I have a program that needs to make BLAST requests. However, all of the methods that I have tried are dreadfully slow.
I have tried using Biopython's NCBIWWW as well as remote BLAST+, but both are horrendously slow. A search of the nt database for human sequences matching "AATGCATGTAGTCAT" takes ~15s to run in the web portal, but 10+ minutes through BLAST+ and NCBIWWW.
Can anyone suggest a faster alternative?
I have avoided setting up a local BLAST database because I was worried it would be difficult and consume a lot of resources. However, I am now starting to think that it may be my only option.
People can run it themselves, but we also have a hosted cloud service. We run normal NCBI BLAST on normal databases (we add security, authentication, logging, and, for people using the graphical interface, pretty pictures).
If you’re willing to pay to use that, it may be a viable alternative. It would likely be much cheaper than running a big enough server yourself full time.
Maybe send me a brief message at contact@sequenceserver.com with more details on what your queries look like and how many you run and how often, so we can make sure we could accommodate them?
Using the BLAST command line tools, you can run remote BLAST searches like this:
blastn -db nt -query test.fa -remote
You don't need to write Python scripts.
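If you want the human-only restriction from the question, something along these lines might be a starting point. This is only a sketch: the file names, the Entrez query string and the choice of `-task blastn-short` are my assumptions (a 15-mer is very short, and as far as I know the web page adjusts its parameters for short queries automatically). I can't promise it will be any faster than what you already tried.

```shell
# Write the 15-mer from the question to a FASTA file
# (the file name and the ">query1" header are arbitrary).
printf '>query1\nAATGCATGTAGTCAT\n' > test.fa

# Only run the search if BLAST+ is actually installed.
if command -v blastn >/dev/null 2>&1; then
    # -remote sends the search to NCBI; -entrez_query (remote only)
    # limits the hits to human entries; -outfmt 6 writes tab-separated hits.
    blastn -task blastn-short -db nt -query test.fa -remote \
        -entrez_query "Homo sapiens[Organism]" \
        -outfmt 6 -out hits.tsv
fi
```

The hits, if any, end up in hits.tsv with one line per alignment.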
That said, I believe the remote BLAST searches run via a different mechanism/priority than the web-based ones. The command line remote searches seem to be heavily restricted (or perhaps run via a different program). I heard NCBI has a high-performance BLAST alternative developed for their internal needs, and I am guessing that is what runs behind the web interface.
For what it's worth, make sure you need BLAST to begin with. There are other tools that work like BLAST and are much faster, DIAMOND for example (though DIAMOND only covers protein and translated searches, not nucleotide-vs-nucleotide ones).