Hi all
I am new to Biostars and BLAST. I am trying to build a BLAST database from a small test set of 25 sequences using the standard tutorial commands. However, I get the following error:
Error: mdb_env_open: There is not enough space on the disk.
In my working directory I get a file of >200 GB, which seems ridiculously large for only 25 sequences. Does anyone have an idea what I am doing wrong?
Kind regards
Michaël
I had the same problem with BLAST+ 2.10.0; I downloaded version 2.2.30 and it worked without any problem.
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.30/
That version is nearly six years old; I would not advise using it anymore except in very specific cases (and this is not one of them).
Did the BLAST algorithm change in any substantial way since 2015? It was already very well established by then, so I would expect no difference whatsoever in the main components.
Yes, it did, somewhat.
For example, from 2.8 onwards it uses a completely new database schema (though still backwards compatible). The way it handles alignment statistics has also changed, along with numerous other bug fixes and improvements (e.g. a hit's e-value from 2.2.30 will not necessarily match the one from 2.10).
You are of course free to keep using the older version, but then don't expect it to be state of the art.
That's interesting info, thank you. What is the difference in e-value calculation - can you point me to where it's described? Would hit ranking still be preserved?
Hmm, I am not that far into the details, but looking through the change logs of the BLAST releases might teach you something (I don't think there has been a manuscript on these), or see here: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=References
200 GB is far too large, but you will have to post the full command (and maybe the database) to get useful responses.
that would be
makeblastdb -in genes_blast.fsa -dbtype nucl -out test
Ok, now we need the size of genes_blast.fsa; please post it.
Edit: I think you are using the wrong file. The .fsa file might already be output of makeblastdb; the real input file is probably a .fna or .fasta file. Delete all the output, make sure you only have a plain FASTA file (every header line starting with >) in the working directory, and then run the command again on that file.
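A minimal sketch of that clean-up step, with invented sequences for illustration (only the grep check runs anywhere; the makeblastdb call requires BLAST+ on your PATH):

```shell
# Create a tiny, well-formed FASTA file (sequences are made up).
printf '>seq1\nATGCATGCATGC\n>seq2\nGGGCCCAAATTT\n' > genes_blast.fasta

# Sanity check: every record header must start with '>'.
grep -c '^>' genes_blast.fasta   # prints 2 for this file

# Then rebuild the database on that file:
# makeblastdb -in genes_blast.fasta -dbtype nucl -out test
```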
Hi, changing the extension to .fasta does not solve the problem.
Sorry, then the problem might be irreproducible, because that looks like a small FASTA file (does it have a > at the beginning of each FASTA header?). That means something else might be wrong. If you want, you can upload the input file somewhere (GitHub, Pastebin, etc.) and we can have a try. Otherwise you need to contact NCBI support.

I tried with just one sequence and even then I get the problem. I tried running it in the Windows cmd and from R, but I always get the same error.
Try reinstalling the latest version of the BLAST binaries for Windows. makeblastdb should be very stable and is used regularly by many thousands of people, so most likely the error is on your side, possibly in your (Windows?) setup. I am sorry, but I don't think we can solve this problem here.
minicola: Your FASTA files do not appear to be in the correct format. It looks like they are missing a > at the beginning of the FASTA header. Is that correct?

Sorry, that is a copy-paste error; the > disappeared when copying into the text field.
Hello, I'm getting the very same error. Tested on two computers with different systems, using the most recent version of BLAST from the NCBI download page. It seems like a bug on their side. C.
Are you using Windows? I don't see this problem on Linux with the latest BLAST+ v2.10.0.

Yes, Windows on both PCs, different versions though.
If you are sure this is a bug then please report it to the NCBI help desk. Be aware that it may take them 2-3 business days to respond; it is also the end of the year, which may increase that time.
This seems to be a Windows 10-specific bug. I still have Windows 8 and the latest BLAST works fine on my computer, but my wife, on Windows 10, couldn't get it to work. The old version did the trick, though.
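If the failures really are tied to the LMDB-backed v5 database format introduced with BLAST+ 2.9 (an assumption; NCBI has not confirmed it in this thread), one workaround that keeps the current binaries is to build the older v4 format, which does not use LMDB. A sketch that only assembles and prints the command, using the file name from this thread; run the printed command on a machine with BLAST+ installed:

```shell
# Assumption: the Windows error comes from LMDB, which backs the v5 format.
# makeblastdb (BLAST+ >= 2.9) accepts -blastdb_version 4 to build the
# older, LMDB-free database format instead.
IN=genes_blast.fsa
CMD="makeblastdb -in $IN -dbtype nucl -out test -blastdb_version 4"
echo "$CMD"
```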
Sorry to dig up an old thread, but I have the same issue when running jobs on a Linux cluster. I'm running makeblastdb jobs on a subset of genes from one microbial genome, and I'm providing up to 288 GB of memory per job, but I still get the error:

makeblastdb -in internal_input.fasta -out internal_input.fasta -dbtype prot
Error: mdb_env_open: Cannot allocate memory
It appears that BLASTDB_LMDB_MAP_SIZE is only used on Windows, and maybe only for older versions. I'm using blast 2.10.1 from Bioconda, because it's a dependency of antiSMASH 5.1.2. Any idea what could be causing the memory allocation issue? I also checked the qacct job logs, and the maximum memory hit by the jobs is < 0.5 GB (before they throw the error).

I think that is a different error. It is best to contact NLM support directly.