Hello, I am a Computer Science & Molecular Biology student at MIT. I am currently working on a research project in the Research Laboratory of Electronics and am trying to run BLAST locally, but I keep running into problems even though I am following the documentation exactly. Has anyone here used BLAST locally before? There is no one in my lab who can help me with it, so I was hoping someone here could.
Basically, to get BLAST output I am using the following code:
result_handler = NcbiblastnCommandline(query= "r70.fasta", db = "MG1655.fasta", out = 'resultr.xml')
os.system(str(result_handler))
But it does not write anything to the resultr.xml file. MG1655.fasta is the database that I want to search against. I tried using an MG1655.txt version too, and it still did not work.
Thanks,
Does the BLAST search run properly outside of BioPython? Do you have BLAST installed on the machine you're working with?
In addition to matt's comment, you should check the value returned by os.system, which will give you the exit status of your call (i.e. let you know if it worked). Using subprocess instead will give you more information. Also, if you want XML you should set outfmt to 5, and the db should be a BLAST db as created by makeblastdb (not a .fasta file, which it might be now?).
How would I go about creating the BLAST db using makeblastdb? I tried looking at the documentation, but it did not seem very clear.
Thanks,
Something like
makeblastdb -in MG1655.fasta -out MG1655 -dbtype nucl
The help from makeblastdb -h is pretty good, I think.
Update: I created the database and made sure BLAST was correctly installed.
However, I think the database is not being read. Are there ways of reading databases in Biopython? Thanks a lot for the help!
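One way to find out why the database is not being read is to follow the earlier suggestion in this thread and run blastn through subprocess, so that the exit status and stderr are visible. The following is only a minimal sketch: the helper names build_blastn_cmd and run_blastn are made up for illustration, and it assumes the database was created with -out MG1655 and that blastn is on your $PATH.

```python
import shutil
import subprocess


def build_blastn_cmd(query, db, out):
    """Build the blastn argument list (hypothetical helper, not a Biopython API)."""
    # -db takes the database name passed to makeblastdb -out, without any
    # .nsq/.nhr/.nin extension; -outfmt 5 requests XML output.
    return ["blastn", "-query", query, "-db", db, "-out", out, "-outfmt", "5"]


def run_blastn(query, db, out):
    """Run blastn and raise with BLAST's own error text if it fails."""
    if shutil.which("blastn") is None:
        raise RuntimeError("blastn was not found on $PATH")
    cmd = build_blastn_cmd(query, db, out)
    # capture_output keeps stdout and stderr as strings instead of losing them,
    # which is the extra information os.system does not give you.
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"blastn exited with status {result.returncode}: {result.stderr.strip()}"
        )
    return result


# Example call, assuming r70.fasta and the MG1655.* database files are in the
# current directory:
# run_blastn("r70.fasta", "MG1655", "resultr.xml")
```

If the database name or path is wrong, the RuntimeError message should contain whatever BLAST printed to stderr, which is usually enough to tell which file it could not find.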
You shouldn't set the database name as MG1655.nsq - assuming DB creation worked and you have files named MG1655.nsq, MG1655.nhr, etc., then BLAST expects you to refer to the database as MG1655 (without these extensions).
As an alternative to using os.system, you could have asked Biopython to run BLASTN with stdout, stderr = result_handler(), which would give an error message if the command failed (non-zero return code), and captures any output as strings (stdout and stderr).
The most likely problems are that your files are not in the current directory, you didn't create a BLAST database, or BLAST is not installed on your $PATH.
Hello, I got it to work. Thank you for all your help. I have another question related to the same project, though. When I check my results against the online version of BLAST, it seems that the standalone BLAST is only returning highly similar sequences (the results I get if I choose the megablast option in the online version). How do I reduce the specificity so that it returns all somewhat similar sequences (blastn)?
Thank you.
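A possible explanation, offered as a pointer rather than a confirmed diagnosis: in BLAST+ the blastn program runs the megablast task by default, and the -task option switches it to the more sensitive traditional algorithm (-task blastn). The sketch below shows how that option could be added to the command. The helper name blastn_cmd_with_task is invented for illustration, and the task list is not exhaustive.

```python
# Tasks accepted by blastn's -task option (not an exhaustive list).
BLASTN_TASKS = ("megablast", "dc-megablast", "blastn", "blastn-short")


def blastn_cmd_with_task(query, db, out, task="blastn"):
    """Build a blastn argument list with an explicit -task (hypothetical helper)."""
    if task not in BLASTN_TASKS:
        raise ValueError(f"unexpected task: {task!r}")
    return [
        "blastn",
        "-task", task,      # "blastn" = traditional search, "megablast" = default
        "-query", query,
        "-db", db,          # database name without .nsq/.nhr/.nin
        "-out", out,
        "-outfmt", "5",     # XML output
    ]


# Example, assuming the same files as earlier in the thread:
# cmd = blastn_cmd_with_task("r70.fasta", "MG1655", "resultr.xml")
# The list can then be passed to subprocess.run(cmd, check=True).
```

I believe the Biopython wrapper accepts the same setting as a task keyword (for example task="blastn"), but check that against your installed version. Other defaults, such as the e-value cutoff, may also differ between the website and the standalone tool, so the hit lists may still not match exactly.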