Hi all
I am new to Biostars and BLAST. I am trying to build a BLAST database from a small test set of 25 sequences using the standard tutorial commands. However, I get the following error:
Error: mdb_env_open: There is not enough space on the disk.
In my working directory I get a file of >200 GB, which seems ridiculously large for only 25 sequences. Does anyone have an idea what I am doing wrong?
Kind regards
Michaël
I had the same problem with BLAST+ 2.10.0; I downloaded version 2.2.30 and it worked without any problem.
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.30/
That version is nearly six years old; I would not advise using it anymore except in very specific cases (and this is not one of them).
Did the BLAST algorithm change in any substantial way since 2015? It was already very well established by then, so I would expect no difference whatsoever in the main components.
Yes, it did, somewhat.
For example, from 2.8 onwards it uses a completely new database schema (though still backwards compatible). The way it handles alignment statistics has also changed, along with numerous other bug fixes and improvements (e.g. a hit's e-value from 2.2.30 will not necessarily match the one from 2.10).
You are of course free to keep using the older version, but then don't expect it to be state of the art.
That's interesting info, thank you. What is the difference in e-value calculation - can you point me to where it's described? Would hit ranking still be preserved?
Hmm, I am not that far into the details, but looking through the change logs of the BLAST releases might teach you something (I don't think there has been a manuscript on these), or see here: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=References
200 GB is far too large, but you will have to post the full command (and maybe the database) to get useful responses.
that would be
makeblastdb -in genes_blast.fsa -dbtype nucl -out test
Ok, now we need the size of genes_blast.fsa; please post it.
Edit: I think you are using the wrong file. The .fsa file might already be output of makeblastdb; the real input file is probably a .fna or .fasta file. Delete all the output, make sure you only have a plain FASTA file (every header line starting with >) in the working directory, and then run the command again on that file.
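A minimal sketch of that clean-up step, with invented sequences for illustration (only the grep check runs anywhere; the makeblastdb call requires BLAST+ on your PATH):

```shell
# Create a tiny, well-formed FASTA file (sequences are made up).
printf '>seq1\nATGCATGCATGC\n>seq2\nGGGCCCAAATTT\n' > genes_blast.fasta

# Sanity check: every record header must start with '>'.
grep -c '^>' genes_blast.fasta   # prints 2 for this file

# Then rebuild the database on that file:
# makeblastdb -in genes_blast.fasta -dbtype nucl -out test
```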
Hi, changing the extension to .fasta does not solve the problem.
Sorry, then the problem might be irreproducible, because that looks like a small FASTA file (does it have a > at the beginning of each FASTA header?). That means something else might be wrong. If you want, you can upload the input file somewhere (GitHub, Pastebin, etc.) and we can have a try. Otherwise you need to contact NCBI support.

I tried with just one sequence and even then I get the problem. I tried running it in the Windows cmd and from R, but I always get the same error.
Try reinstalling the latest version of the BLAST binaries for Windows. makeblastdb should be very stable and is used regularly by many thousands of people, so most likely the error is on your side, possibly in your (Windows?) setup. I am sorry, but I don't think we can solve this problem here.
minicola: Your FASTA files do not appear to be in the correct format. It looks like they are missing a > at the beginning of the FASTA header. Is that correct?

Sorry, that is a copy-paste error; the > disappeared when copying into the text field.
Hello, I'm getting the very same error. Tested on two computers with different systems, using the most recent version of BLAST from the NCBI download page. It seems like a bug on their side. C.
Are you using Windows? I don't see this problem on Linux with the latest BLAST+ v2.10.0.

Yes, Windows on both PCs, different versions though.
If you are sure this is a bug then please report it to the NCBI help desk. Be aware that it may take them 2-3 business days to respond; it is also the end of the year, which may increase that time.
This seems to be a Windows 10-specific bug. I still have Windows 8 and the latest BLAST works fine on my computer, but my wife, on Windows 10, couldn't get it to work. The old version did the trick, though.
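If the failures really are tied to the LMDB-backed v5 database format introduced with BLAST+ 2.9 (an assumption; NCBI has not confirmed it in this thread), one workaround that keeps the current binaries is to build the older v4 format, which does not use LMDB. A sketch that only assembles and prints the command, using the file name from this thread; run the printed command on a machine with BLAST+ installed:

```shell
# Assumption: the Windows error comes from LMDB, which backs the v5 format.
# makeblastdb (BLAST+ >= 2.9) accepts -blastdb_version 4 to build the
# older, LMDB-free database format instead.
IN=genes_blast.fsa
CMD="makeblastdb -in $IN -dbtype nucl -out test -blastdb_version 4"
echo "$CMD"
```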
Sorry to dig up an old thread, but I have the same issue when running jobs on a Linux cluster. I'm running makeblastdb jobs on a subset of genes from one microbial genome, and I'm providing up to 288 GB of memory per job, but I still get the error:

makeblastdb -in internal_input.fasta -out internal_input.fasta -dbtype prot
Error: mdb_env_open: Cannot allocate memory
It appears that BLASTDB_LMDB_MAP_SIZE is only used on Windows, and maybe only for older versions. I'm using blast 2.10.1 from Bioconda, because it's a dependency of antiSMASH 5.1.2. Any idea what could be causing the memory allocation issue? I also checked the qacct job logs, and the maximum memory hit by the jobs is < 0.5 GB (before they throw the error).

I think that is a different error. It is best to contact NLM support directly.