How to create a 10 GB BLAST database
10.3 years ago
arronar ▴ 290

Hello.

I want to create a database for local BLAST, but the FASTA file with the reads is 9 GB. I want to build a single database from this file rather than many files, so I set the -max_file_sz argument as shown in the command below, but it still creates many files of 2.1 GB each. How can I create only one database file? Should I merge them later?

Thank you.

Here is the command I use.

makeblastdb -in /mnt/usb/sra/merged.fasta -max_file_sz '10GB' -dbtype nucl -out /mnt/usb/sra/merged_db
10.3 years ago
Zhaorong ★ 1.4k

I may be wrong, but it seems "-max_file_sz" does not allow a size greater than 2GB.

Check out lines 1122-1126 of the makeblastdb source code:

Uint8 bytes = NStr::StringToUInt8_DataSize(args["max_file_sz"].AsString());
if (bytes >= (1L << 31)) {
    NCBI_THROW(CInvalidDataException, eInvalidInput, "max_file_sz must be < 2 GiB");
}
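
The comparison in that check can be reproduced with plain shell arithmetic. This is only a sketch of the numbers involved, not of how makeblastdb parses its argument: the variable names are illustrative, and reading '10GB' as 10 GiB is an assumption (the decimal reading, 10 × 10^9 bytes, is also well above the limit).

```shell
# Threshold used in the quoted check: 1 << 31 bytes, i.e. 2 GiB.
limit=$((1 << 31))
echo "limit:     $limit bytes"

# Assumption: '10GB' is read as 10 GiB.
requested=$((10 * 1024 * 1024 * 1024))
echo "requested: $requested bytes"

# The quoted check rejects any value that is >= the limit.
if [ "$requested" -ge "$limit" ]; then
    echo "requested size is not below 2 GiB"
fi
```

So a value such as '10GB' is far above what the quoted check accepts, whichever way the suffix is read.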

I think an alternative is to use blastdb_aliastool, which comes with NCBI BLAST+. Please refer to the "Aggregate existing BLAST databases" section of the BLAST Command Line Applications User Manual. I have copied the relevant part below for easier access.

To combine the two nematode nucleotide databases, named "nematode_mrna" and "nematode_genomic", we use the following command line:

$ blastdb_aliastool -dblist "nematode_mrna nematode_genomic" -dbtype nucl \
  -out nematode_all -title "Nematode RefSeq mRNA + Genomic"
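
Once such an alias exists, a search can refer to it by its name as if it were one database. A minimal sketch, assuming BLAST+ is on the PATH; `query.fasta` and `hits.tsv` are hypothetical file names, and the block skips the search if `blastn` or the query file is missing:

```shell
db="nematode_all"      # alias name created by blastdb_aliastool above
query="query.fasta"    # hypothetical query file
out="hits.tsv"         # hypothetical output file

if command -v blastn >/dev/null 2>&1 && [ -f "$query" ]; then
    # -outfmt 6 writes tab-separated hits, one per line.
    blastn -db "$db" -query "$query" -outfmt 6 -out "$out"
else
    echo "blastn or $query not found; skipping search" >&2
fi
```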

Thank you, but this doesn't really merge the databases. Instead, it creates a virtual database in a .nal file.

So is there any way to merge the .nhr, .nin and .nsq files into a single database?
