What's the best way to quickly use BLAST on a defined set of organisms?
4.0 years ago
geosmin ▴ 20

I have a set of organisms that I want to perform BLAST searches on with the same query. Of course I want to automate this procedure instead of running BLAST individually for each organism. I've tried these three approaches:

  • Biopython's NCBIWWW.qblast() method (very slow)
  • blast+ program with local database (problems setting up the database)
  • downloading genome data via FTP and building a database with makeblastdb (haven't figured out how yet)

Before I dive deeper into each topic, I wanted to ask what other people would do. It seems like such a trivial task, so I guess there must be a somewhat simple procedure.

blast • 1.2k views

What is the size of the query? You may be able to run a blast+ search at NCBI with the -remote option (you will need to download the compiled blast+ binaries), using command-line BLAST and limiting the search to specific organisms/taxonomy IDs.
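A rough sketch of such a remote search, assuming blast+ is installed and on the PATH. The helper names, file names, and taxonomy IDs are placeholders for illustration. The organism limit is expressed here through `-entrez_query`, which blast+ accepts together with `-remote`; whether other taxonomy filters can be combined with `-remote` may depend on the blast+ version.

```python
import subprocess


def remote_blastp_command(query_fasta, taxids, out_path, db="nr"):
    """Build a blastp command that runs at NCBI, limited to the given taxa.

    The function name and defaults are hypothetical; only the blast+ flags
    themselves come from the blast+ command line.
    """
    # txid<N>[ORGN] is the Entrez syntax for restricting hits to one taxon.
    entrez = " OR ".join(f"txid{t}[ORGN]" for t in taxids)
    return [
        "blastp", "-remote",
        "-db", db,
        "-query", query_fasta,
        "-entrez_query", entrez,
        "-outfmt", "6",          # tabular output, one hit per line
        "-out", out_path,
    ]


def run_remote_blastp(query_fasta, taxids, out_path):
    """Run the search; raises CalledProcessError if blastp fails."""
    subprocess.run(remote_blastp_command(query_fasta, taxids, out_path), check=True)
```

For example, `run_remote_blastp("query.faa", [9606, 10090], "hits.tsv")` would send one query against human and mouse entries only. Remote searches still queue on NCBI's servers, so this does not remove the waiting time.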


Thank you for your answer! My query is about 500 amino acids long. It seemed to work at first, but then it took forever again when I specified an Entrez search query. I'm not sure if it's due to that or just a server connection problem (I've had that many times).


Try using the taxID limits. That may work better than the Entrez query. Alternatively, if you are doing a blastp and depending on how many organisms you need, you could download their protein sequences (.faa files) from the respective genome directories and search them locally. This is not that difficult if you are already familiar with the Unix/blast+ command line.
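The local route could look roughly like this, assuming the .faa files have already been downloaded and decompressed, and that `makeblastdb` and `blastp` are on the PATH. All function names, the database name, and the E-value cutoff are illustrative choices, not something prescribed in this thread.

```python
import subprocess
from pathlib import Path


def combine_fasta(faa_files, combined_path):
    """Concatenate per-organism protein FASTA files into a single file."""
    with open(combined_path, "w") as out:
        for path in faa_files:
            text = Path(path).read_text()
            # Make sure each file ends with a newline so records do not merge.
            out.write(text if text.endswith("\n") else text + "\n")
    return combined_path


def makeblastdb_command(fasta_path, db_name):
    """Command that builds a protein BLAST database from a FASTA file."""
    return ["makeblastdb", "-in", str(fasta_path), "-dbtype", "prot", "-out", db_name]


def blastp_command(query_fasta, db_name, out_path, evalue="1e-5"):
    """Command that searches the local database and writes tabular output."""
    return ["blastp", "-query", query_fasta, "-db", db_name,
            "-evalue", evalue, "-outfmt", "6", "-out", out_path]


def run_local_search(faa_files, query_fasta, db_name="my_organisms", out_path="hits.tsv"):
    """Combine the proteomes, build the database once, then run the query."""
    combined = combine_fasta(faa_files, db_name + ".faa")
    subprocess.run(makeblastdb_command(combined, db_name), check=True)
    subprocess.run(blastp_command(query_fasta, db_name, out_path), check=True)
```

Building one combined database means the query runs once against all organisms. If per-organism results are needed, one option is to build a separate database per .faa file and loop over them instead.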


Has anyone found code to run BLAST on a defined set of organisms using Biopython's NCBIWWW.qblast() method?
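One possible sketch, assuming a Biopython version that still ships `Bio.Blast.NCBIWWW` and `Bio.Blast.NCBIXML`: `qblast()` takes an `entrez_query` argument, so the organism set can be passed as an Entrez expression built from taxonomy IDs. The helper names and the example taxonomy IDs are made up for illustration, and the search is still subject to NCBI's server-side queueing, so it will not necessarily be fast.

```python
def entrez_query_for_taxids(taxids):
    """Turn taxonomy IDs into an Entrez organism filter, e.g. txid9606[ORGN]."""
    return " OR ".join(f"txid{t}[ORGN]" for t in taxids)


def qblast_limited(sequence, taxids, program="blastp", database="nr"):
    """Run one qblast search restricted to the given taxa and parse the result.

    Biopython is imported here so the query helper above works without it.
    """
    from Bio.Blast import NCBIWWW, NCBIXML

    handle = NCBIWWW.qblast(
        program, database, sequence,
        entrez_query=entrez_query_for_taxids(taxids),
    )
    return NCBIXML.read(handle)  # one record, since a single query was sent


def print_hits(record, max_evalue=1e-5):
    """Print title and E-value for every HSP below the cutoff."""
    for alignment in record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect <= max_evalue:
                print(alignment.title, hsp.expect)
```

A call such as `print_hits(qblast_limited(my_protein_sequence, [9606, 10090]))` would list hits from human and mouse only, where `my_protein_sequence` is a placeholder for the query string.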
