I want to use DIAMOND for functional analysis of my metagenome. According to the instructions, I have to download the NCBI nr database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/). Unfortunately, my internet connection is not very stable, so I downloaded the multiple nr.**.tar.gz files instead of the single nr gz file, using this code:
After that, I ended up with a lot of files in my output directory (~180 GB). How can I combine all these files into a single nr.faa, as in the DIAMOND manual?
DIAMOND needs its own database format; it does not work with BLAST databases, which is what you are downloading. You have to download the nr FASTA file instead, then build the DIAMOND database from it:
wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz
diamond makedb --in nr.gz -d nr
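Since the connection in the question is unstable, here is a minimal sketch of how the single-file download could be retried and resumed rather than restarted. It relies on wget's -c (continue) option; fetch_with_resume, the attempt limit, and the RETRY_DELAY variable are hypothetical names and values chosen for illustration, not part of wget or DIAMOND.

```shell
# fetch_with_resume URL [MAX_ATTEMPTS]
# Hypothetical helper: keep calling wget until the download completes
# or the attempt limit is reached.
fetch_with_resume() {
    url=$1
    max_attempts=${2:-20}
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # -c continues a partially downloaded file instead of starting over
        if wget -c "$url"; then
            return 0
        fi
        echo "attempt $attempt failed, retrying" >&2
        # RETRY_DELAY (seconds) is an assumed tunable, default 30
        sleep "${RETRY_DELAY:-30}"
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage (uncomment to run):
# fetch_with_resume ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz \
#     && diamond makedb --in nr.gz -d nr
```

Chaining makedb with && means the database is only built once the download has actually finished.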
Edit (2022/11/08):
Since version 2.0.8, DIAMOND can use original BLAST databases. You have to run diamond prepdb on a native BLAST database before using it.
Just to add to the information above: DIAMOND has supported BLAST databases since v2.0.8. However, you need to download the executable from GitHub, as the conda version doesn't support this feature. You can use the following commands to install DIAMOND and then prepare the database with diamond prepdb:
wget http://github.com/bbuchfink/diamond/releases/download/v2.1.9/diamond-linux64.tar.gz
tar xzf diamond-linux64.tar.gz
./diamond prepdb -d /path/to/db/nr
The prepdb command above prepares the BLAST nr database for DIAMOND and generates the corresponding *.acc file. This takes quite some time (~50 min in my case).
If you use the conda version of DIAMOND, prepdb fails with an error:
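For the original situation (a directory full of nr.**.tar.gz volumes), the archives have to be unpacked before prepdb can see the database. A rough sketch, assuming all volumes sit in one directory; extract_blast_volumes is a hypothetical helper name, and the paths and file names in the usage lines are placeholders:

```shell
# extract_blast_volumes DIR
# Hypothetical helper: unpack every nr.*.tar.gz volume found in DIR, in place.
extract_blast_volumes() {
    dir=$1
    for archive in "$dir"/nr.*.tar.gz; do
        # Skip the unexpanded pattern when no archive matches
        [ -e "$archive" ] || continue
        tar xzf "$archive" -C "$dir" || return 1
    done
    return 0
}

# Usage (placeholder paths; uncomment to run):
# extract_blast_volumes /path/to/db
# ./diamond prepdb -d /path/to/db/nr
# ./diamond blastx -d /path/to/db/nr -q reads.fasta -o matches.tsv
```

With this route there is no need to merge anything into a single nr.faa; the search is pointed at the BLAST database prefix directly.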
$ diamond prepdb -d nr
diamond v2.1.9.163 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)
Error: This executable was not compiled with support for BLAST databases.
DIAMOND also needs more RAM than BLAST+, which is something to keep in mind.
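If memory is tight, DIAMOND's block size (-b) and index chunks (-c) options are the usual knobs to turn. A small sketch of a low-memory run follows; run_diamond_lowmem, BLOCK_SIZE, and INDEX_CHUNKS are hypothetical names, and the values are only an assumed starting point to tune for your machine:

```shell
# run_diamond_lowmem DB QUERY OUT
# Hypothetical wrapper around a blastx search with reduced memory settings.
run_diamond_lowmem() {
    db=$1
    query=$2
    out=$3
    # -b: block size in billions of sequence letters (smaller uses less memory)
    # -c: number of index chunks (more chunks use less memory but run slower)
    diamond blastx -d "$db" -q "$query" -o "$out" \
        -b "${BLOCK_SIZE:-1}" -c "${INDEX_CHUNKS:-4}"
}

# Usage (placeholder file names; uncomment to run):
# run_diamond_lowmem nr reads.fasta matches.tsv
```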
How long does it usually take to build nr with diamond using all of the taxonomy files?