makeblastdb file size question
4.6 years ago

Hi,

I made a BLAST database from 901 sequences using the makeblastdb command. I did not notice it at first, but the .ntf and .ndb files were HUGE: 300 GB each, taking up most of the remaining space on my poor hard drive.

I deleted them and remade the database with the -max_file_sz option set to 1GB. But it left me wondering: what are these files used for, and would having larger versions of them be beneficial in any way? Or was it an error in the first place that made them that large?

I hope someone can enlighten me on these questions.

Thanks, Julian

blast makeblastdb • 3.6k views

How long were the sequences?

At first glance, I can't see any reason for the databases to be that size. You can download the entirety of the nr database for about half that, if memory serves (pun intended).
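For anyone wanting to answer the sequence-length question for their own input, here is a minimal Python sketch that counts records and residues in a FASTA file. The function name `fasta_stats` and the file name in the usage note are hypothetical; it assumes a plain, uncompressed FASTA file.

```python
def fasta_stats(path):
    """Return (record count, total residues, longest record) for a FASTA file."""
    count = 0      # number of '>' header lines seen
    total = 0      # residues summed over all records
    longest = 0    # length of the longest record
    current = 0    # length of the record currently being read
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record.
                longest = max(longest, current)
                count += 1
                current = 0
            elif line:
                current += len(line)
                total += len(line)
    # Close the final record, which has no header after it.
    longest = max(longest, current)
    return count, total, longest
```

Calling `fasta_stats("sequences.fasta")` on the input gives a quick sanity check: a few hundred sequences totalling a few megabytes should not need hundreds of gigabytes of database files.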


Strange, there must be something wrong with my BLAST installation, I guess.


Hi Julian.dekker, I am having the same issue: in my case makeblastdb is creating huge 300GB pdb and ptf files from a small fasta file containing 425 sequences. However, setting a limit for the file size with -max_file_sz does not work in my case, so I was wondering whether you got to the root of the problem, or whether reinstalling blast+ helped?

Kind regards, Joscha

4.2 years ago
felix • 0

I just had the same problem with the latest version, 2.10.1+. makeblastdb created a database of ~600GB out of a 21MB fasta file! I only noticed it because I failed to make a new database file, as Windows told me my HDD was full...

Setting the max_file_sz parameter didn't solve the problem for me either (it should be set to 1 GB by default according to the documentation). I then used the parameter "-blastdb_version 4", and this reduced the file size to only 45MB.

Maybe this is a bug, or even a "feature": according to the last paragraph in https://www.ncbi.nlm.nih.gov/books/NBK279688/ this might be related to the virtual memory they create since the 2.10.0 release.
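As a rough illustration of this workaround, here is a Python sketch that assembles a makeblastdb call with "-blastdb_version 4" and then lists the sizes of the files it produced. The helper names (`makeblastdb_cmd`, `build_db`, `db_file_sizes`) and the example paths are hypothetical; it assumes makeblastdb is on the PATH and only uses the standard -in, -dbtype, -out and -blastdb_version options.

```python
import subprocess
from pathlib import Path


def makeblastdb_cmd(fasta, dbtype, out, db_version=4):
    """Build the makeblastdb argument list (dbtype is 'nucl' or 'prot')."""
    return [
        "makeblastdb",
        "-in", str(fasta),
        "-dbtype", dbtype,
        "-out", str(out),
        "-blastdb_version", str(db_version),
    ]


def build_db(fasta, dbtype, out, db_version=4):
    """Run makeblastdb; raises CalledProcessError if it exits with an error."""
    subprocess.run(makeblastdb_cmd(fasta, dbtype, out, db_version), check=True)


def db_file_sizes(out):
    """Map each file named '<out>.<extension>' to its size in bytes.

    Note: st_size is the logical file size, which can differ from the
    space actually occupied on disk.
    """
    out = Path(out)
    return {
        p.name: p.stat().st_size
        for p in sorted(out.parent.glob(out.name + ".*"))
    }
```

For example, `build_db("sequences.fasta", "nucl", "db/mydb")` followed by `db_file_sizes("db/mydb")` shows at a glance whether any single database file is unexpectedly large.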


Yes, I was also wondering about the virtual memory, but I didn't know that it was related to the db version! Thank you, I'll check that out!
