Specifying database path in BLAST standalone
3.7 years ago
Hansen_869 ▴ 80

Hello, I am in the midst of downloading the BLASTP (nr) database. However, I have limited space on my PC, so I am downloading it to my local NAS server. How do I point to my own database when running BLAST jobs? I am running it on my Windows PC. Thanks!

blast path standalone • 2.7k views

Thanks! When I try to run BLAST on my command line:

blastp -query C:\Users\Myuser\Desktop\Fastas\Myfasta.fasta -db Y:\NAS_Server\BLAST\NR -out result.txt -outfmt 6 -evalue 0.00001 -max_target_seqs 1

I get the following error:

BLAST Database error: No alias or index file found for protein database [Y:\NAS_Server\BLAST\NR] in search path [C:\Users\Myuser;\NAS_Server\BLAST;]

I have downloaded all the nr.tar.gz files and extracted them so that they are in a folder structure like nr.00, nr.01, nr.02, etc. Should all the files be in the same folder?


Did you download ALL nr* files (there are 41 as of today) from NCBI and uncompress them? All nr* files need to be in the same folder.
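For anyone scripting the unpacking step, here is a minimal Python sketch that extracts every archive into one shared folder. It assumes the downloads are named nr.*.tar.gz; the function name extract_all and the paths in the usage comment are hypothetical. Files that appear in every archive under the same name (such as nr.pal) simply get overwritten by the later copy.

```python
import tarfile
from pathlib import Path


def extract_all(archive_dir, target_dir):
    """Unpack every nr.*.tar.gz found in archive_dir into the single folder target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(archive_dir).glob("nr.*.tar.gz")):
        # "r:gz" undoes the gzip layer and reads the tar bundle in one pass.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target)
            extracted.extend(tar.getnames())
    return extracted


# Hypothetical usage (both paths are placeholders, not from the thread):
# extract_all(r"Y:\downloads", r"Y:\NAS_Server\BLAST\NR")
```

The return value is the list of member names, which makes it easy to see afterwards whether every volume actually landed in the target folder.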


I have! And now I receive the following: Input db vol size does not match lmdb vol size. Regarding the nr.pal, taxdb.btd and taxdb.bti files located in each of the 41 NR folders: do I need just one set of those files? They all have the same name (no numbering).


Yes, those files are fine. Was there any issue with the download? Did you download via the web or with the update_blastdb.pl Perl script that NCBI provides?

Can you check that the db downloaded without errors by running blastdbcheck -db \path_to\nr?


Via the web. Using the update_blastdb.pl Perl script kept giving me an error, something about line 372 in the script, just as it finished downloading nr.00.

So I ran blastdbcheck and got the following:

Testing 42 volume(s).

X:\NR\nr.00 / MetaData: [ERROR] caught exception.

X:\NR\nr.01 / MetaData: [ERROR] caught exception.

X:\NR\nr.02 / MetaData: [ERROR] caught exception.

X:\NR\nr.03 / MetaData: [ERROR] caught exception.

X:\NR\nr.04 / MetaData: [ERROR] caught exception.

X:\NR\nr.05 / MetaData: [ERROR] caught exception.

X:\NR\nr.06 / MetaData: [ERROR] caught exception.

X:\NR\nr.07 / MetaData: [ERROR] caught exception.

X:\NR\nr.08 / MetaData: [ERROR] caught exception.

X:\NR\nr.09 / MetaData: [ERROR] caught exception.

X:\NR\nr.10 / MetaData: [ERROR] caught exception.

X:\NR\nr.11 / MetaData: [ERROR] caught exception.

X:\NR\nr.12 / MetaData: [ERROR] caught exception.

X:\NR\nr.13 / MetaData: [ERROR] caught exception.

X:\NR\nr.14 / MetaData: [ERROR] caught exception.

X:\NR\nr.15 / MetaData: [ERROR] caught exception.

X:\NR\nr.16 / MetaData: [ERROR] caught exception.

X:\NR\nr.17 / MetaData: [ERROR] caught exception.

X:\NR\nr.18 / MetaData: [ERROR] caught exception.

X:\NR\nr.19 / MetaData: [ERROR] caught exception.

X:\NR\nr.20 / MetaData: [ERROR] caught exception.

X:\NR\nr.21 / MetaData: [ERROR] caught exception.

X:\NR\nr.22 / MetaData: [ERROR] caught exception.

X:\NR\nr.23 / MetaData: [ERROR] caught exception.

X:\NR\nr.24 / MetaData: [ERROR] caught exception.

X:\NR\nr.25 / MetaData: [ERROR] caught exception.

X:\NR\nr.26 / MetaData: [ERROR] caught exception.

X:\NR\nr.27 / MetaData: [ERROR] caught exception.

X:\NR\nr.28 / MetaData: [ERROR] caught exception.

X:\NR\nr.29 / MetaData: [ERROR] caught exception.

X:\NR\nr.30 / MetaData: [ERROR] caught exception.

X:\NR\nr.31 / MetaData: [ERROR] caught exception.

X:\NR\nr.32 / MetaData: [ERROR] caught exception.

X:\NR\nr.33 / MetaData: [ERROR] caught exception.

X:\NR\nr.34 / MetaData: [ERROR] caught exception.

X:\NR\nr.35 / MetaData: [ERROR] caught exception.

X:\NR\nr.36 / MetaData: [ERROR] caught exception.

X:\NR\nr.37 / MetaData: [ERROR] caught exception.

X:\NR\nr.38 / MetaData: [ERROR] caught exception.

[ERROR] caught exception in X:\NR\nr.39

[ERROR] caught exception in X:\NR\nr.40

[ERROR] caught exception in X:\NR\nr.41

Result=FAILURE. 42 errors reported in 42 volume(s).

Testing 1 alias(es).

X:\NR\nr.pal / AliasFileTest: [ERROR] caught exception in initializing blastdb

Result=FAILURE. 1 errors reported in 1 alias(es).

Total errors: 81


That is not good news. Looks like you must have run into some problem.

You should have seen something like this.

$ blastdbcheck -db nr -dbtype prot
Writing messages to <stdout> at verbosity (Summary)
ISAM testing is ENABLED.
Legacy testing is DISABLED.
TaxID testing is DISABLED.
By default, testing 200 randomly sampled OIDs.

Testing 42 volume(s).
 Result=SUCCESS. No errors reported for 42 volume(s).
Testing 1 alias(es).
 Result=SUCCESS. No errors reported for 1 alias(es).
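If you capture the blastdbcheck output to a file or a string, a rough Python sketch like the one below can boil it down to a pass/fail summary. It only looks for the "[ERROR]" and "Result=..." text patterns that appear in the outputs quoted in this thread; the function name summarise_blastdbcheck is made up for illustration, and other BLAST+ versions may format their messages differently.

```python
import re


def summarise_blastdbcheck(output):
    """Count [ERROR] lines and collect the Result=... verdicts from blastdbcheck text."""
    errors = sum(1 for line in output.splitlines() if "[ERROR]" in line)
    verdicts = re.findall(r"Result=(SUCCESS|FAILURE)", output)
    # Treat the run as OK only if at least one verdict was seen and none failed.
    ok = bool(verdicts) and "FAILURE" not in verdicts
    return {"errors": errors, "verdicts": verdicts, "ok": ok}
```

A run with no verdict lines at all is reported as not OK, on the assumption that an empty or truncated log should not count as a pass.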

Yeah, no clue what to do from here. Running with -verbosity 4, I get the following for all instances: T0 "e:\remoteapp\tmp64_17\32_1601968763\c++\src\objtools\blast\seqdb_reader\seqdbimpl.cpp", line 1638: Error: (CSeqDBException::eArgErr) BLASTDB::ncbi::CSeqDBImpl::GetTaxInfo() - Taxid 9606 not found


You will have to re-download the data, if there was corruption of some kind. Be aware of firewall restrictions in your local environment. Download data as binary. Looks like there should be about 260GB of total data for nr.


Yeah I'll do that. How do I download as binary?


I mentioned that in case you were going to use an FTP client. If you are simply using a browser, then saving the files locally should store them in the right format.


Awesome, still decompressing. When I compute the MD5 checksums of the files, none of them match the published values. I even tried downloading the taxdb file (40 MB) from BLAST and the MD5 still doesn't match. Isn't it very unlikely that all the files are corrupt?


So the md5 values in those .md5 files from NCBI don't match what you generate locally before uncompressing that file? I don't know if Windows in the equation is causing this but that does make it sound like there is something wrong.

Are you downloading at work? Is there a local firewall involved? I wonder if some inline packet inspection program is causing this corruption.
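To rule out a problem with the checksum tool itself, here is a small Python sketch that hashes an archive in binary mode and compares it with the matching .md5 file. It assumes the .md5 file follows the usual md5sum layout (checksum first, then the file name); the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in binary chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(archive, md5_file):
    """Compare an archive with its .md5 file (assumption: checksum is the first field)."""
    expected = Path(md5_file).read_text().split()[0].lower()
    return md5_of(archive) == expected
```

The check has to be run on the downloaded .tar.gz itself, before uncompressing, because the published checksum describes the archive and not the extracted files.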


Yeah, the values generated are totally different for all of them. I'm only downloading at home and only my Windows firewall is active. Could that really be the issue?


That is really odd. If the database does not pass blastdbcheck, then you are likely not going to be able to use it.


Weird thing, but since I downloaded it again, it works! No idea what was wrong. Only issue now is that it takes about 2 hours to complete 1 single search.


Good to know. I hope you are using multiple threads (as many as your CPU allows), which should provide a speed boost.

 -num_threads <Integer, >=1>
   Number of threads (CPUs) to use in the BLAST search
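As a rough illustration, the Python sketch below builds the same blastp command used earlier in this thread and adds -num_threads, defaulting to the number of CPUs that Python reports. The helper name blastp_command is hypothetical, and the -db value assumes the nr files sit directly inside Y:\NAS_Server\BLAST\NR.

```python
import os


def blastp_command(query, db, out, threads=None):
    """Build a blastp argument list that mirrors the command in this thread."""
    if threads is None:
        # os.cpu_count() can return None, so fall back to a single thread.
        threads = os.cpu_count() or 1
    return [
        "blastp",
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", "6",
        "-evalue", "0.00001",
        "-max_target_seqs", "1",
        "-num_threads", str(threads),
    ]


# The -db path is an assumption: the folder from the thread plus the "nr" prefix.
cmd = blastp_command(
    r"C:\Users\Myuser\Desktop\Fastas\Myfasta.fasta",
    r"Y:\NAS_Server\BLAST\NR\nr",
    "result.txt",
)
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

With the database on a NAS, disk and network reads may still dominate the run time, so more threads will not necessarily give a proportional speed-up.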

I'll give it a go and report back, thanks!

3.7 years ago
GenoMax 147k

You would just use the mount point on your local PC. Since you are using Windows: if your NAS is mounted as the Y drive, you would use -db Y:\path_to_blast_database_folder_including_any_intermediate_folders\nr (to use the nr database).

If you are using Unix, it would be -db /mnt/path_to_blast_database_folder/nr.
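Note that -db takes the database prefix (folder plus "nr"), not just the folder. A quick way to check a prefix before launching a search is sketched below in Python. It assumes a protein database, whose alias file ends in .pal and whose index files end in .pin; the function name find_db_files is made up for this example.

```python
from pathlib import Path


def find_db_files(db_prefix, extensions=(".pal", ".pin")):
    """Return the alias/index files that exist for a protein database prefix."""
    prefix = str(db_prefix)
    return [prefix + ext for ext in extensions if Path(prefix + ext).is_file()]


# Hypothetical usage: an empty list means BLAST would probably report
# "No alias or index file found" for this prefix.
# find_db_files(r"Y:\NAS_Server\BLAST\NR\nr")
```

Passing only the folder (for example Y:\NAS_Server\BLAST\NR) returns an empty list, which would be consistent with the error message quoted in the comments.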


Thanks, I will try that out! Another question, do I have to unzip just the GZ or also the TAR when downloading the database files?


Yes, you have to uncompress the files, which means both gunzipping and untarring them. Make sure all the files are there for large databases like nt/nr.


Alright, thanks! Do you know why they are double-compressed? With both GZ and TAR.


TAR is not really a compressor; it only bundles a large number of files into one archive. Gzip is the only compressor.
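To make the difference concrete, here is a small Python sketch that packs the same two in-memory files once as a plain tar and once as tar plus gzip, then prints both sizes. The file names and contents are dummy placeholders, not real database files, and the helper name bundle is hypothetical.

```python
import io
import tarfile


def bundle(files, mode):
    """Pack {name: bytes} into an in-memory archive; mode is "w" (tar) or "w:gz" (tar + gzip)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Dummy, highly repetitive contents so the effect of gzip is easy to see.
files = {"nr.00.psq": b"A" * 100_000, "nr.00.pin": b"C" * 100_000}
plain = bundle(files, "w")
gzipped = bundle(files, "w:gz")
print(len(plain), len(gzipped))
```

The plain tar comes out slightly larger than the combined inputs because it adds headers and padding, while the gzipped version is far smaller: the bundling and the compression are separate steps.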
