I'm trying to install BLAST locally on my computer (running Ubuntu 20.04 under WSL on Windows 10).
BLAST was installed on my C drive (as the path in the error message below shows), but I'm trying to download the nr database to an external hard drive (F:), since my C drive doesn't have enough space.
The NCBI manual suggested using the update_blastdb.pl script to download the database, so I tried this first:
cd /mnt/f/NCBI/database
update_blastdb.pl --decompress nr [*]
But this gave me the following error:
Connected to NCBI
[*] not found, skippingblast.
Downloading nr (60 volumes) ...
Downloading nr.00.tar.gz...Unable to close datastream at mnt/c/Program Files/NCBI/blast-2-12.0+/bin/update_blastdb.pl line 444.
Failed to download nr.00.tar.gz.md5!
I found that this is sometimes related to network speed, so I also tried adding the --passive no option, but it returned the same error. Could my installation path be the issue, or is this purely a network problem? Is there anything else I can do to download the database?
Try downloading the files manually from the FTP site or website (https://ftp.ncbi.nlm.nih.gov/blast/db/).
BTW, the database is large, and searching against it on an external drive would be very slow.
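In case it helps, here is a rough sketch of what a manual download could look like. It assumes the volumes are named nr.00.tar.gz through nr.59.tar.gz (the 60 volumes reported in the log above) and that wget is installed. The function names nr_urls and download_nr are made up for this example. The volume count and naming change over time, so check the directory listing at the URL first.

```shell
# Sketch only: build the list of URLs for the nr volumes and their .md5 files.
# BASE_URL is the address given in the answer above.
BASE_URL="https://ftp.ncbi.nlm.nih.gov/blast/db"

# nr_urls N prints the URLs for volumes 0..N-1 (default 60, taken from the log).
# The two-digit naming is an assumption based on the file names in the question.
nr_urls() {
    volumes="${1:-60}"
    i=0
    while [ "$i" -lt "$volumes" ]; do
        vol=$(printf 'nr.%02d.tar.gz' "$i")
        echo "$BASE_URL/$vol"
        echo "$BASE_URL/$vol.md5"
        i=$((i + 1))
    done
}

# Hypothetical helper: fetch every file into the current directory.
# wget -c resumes a partially downloaded file instead of starting over,
# which may help if the connection drops partway through a volume.
download_nr() {
    nr_urls "$1" | while read -r url; do
        wget -c "$url" || echo "failed: $url" >&2
    done
}
```

You would run it from the target directory, e.g. cd /mnt/f/NCBI/database followed by download_nr 60, and rerun it to pick up any files that failed.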
OK, thanks for the link and for the comment about the speed. I don't know if there's an easy solution for the C drive space and will need to think about it. Also, I couldn't find out exactly which files the update_blastdb.pl script would download. Is it just the 60 files named like nr.00.tar.gz, or do I need the 60 nr.00.tar.gz.md5 files too?
The .md5 file is used to check file integrity, so downloading it is recommended.
You may use a workstation or a server with enough hard drive space and main memory to run BLAST against nr. A personal computer is not recommended.
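For the integrity check, something along these lines should work once an archive and its .md5 file are in the same directory. This is a sketch that assumes the .md5 file is in the usual md5sum format (a hash followed by the file name), so md5sum -c can read it directly. The function name verify_and_extract is made up for this example.

```shell
# Sketch only: verify one downloaded volume against its .md5 file and
# unpack it only if the checksum matches. Run from the directory that
# holds both files, because the .md5 file refers to the archive by name.
verify_and_extract() {
    archive="$1"
    if md5sum -c "$archive.md5"; then
        # Checksum matched: unpack into the current directory.
        tar -xzf "$archive"
    else
        echo "checksum mismatch, download again: $archive" >&2
        return 1
    fi
}
```

For example, verify_and_extract nr.00.tar.gz checks and unpacks the first volume. A mismatch usually means the download was incomplete.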