I am running a local BLAST server. I can format and BLAST my own databases. However, I am unsure how to set up the "Nucleotide collection nr/nt" database from this NCBI BLAST URL.
Can I just download a preformatted db and use the update script? Which database is it? Is it just both the nr and nt databases? Isn't blastn used for the nt database and blastp used for the nr database? Can I BLAST them both at the same time? If so, how?
Also, downloading nr downloads two files, nr.00.tar.gz and nr.01.tar.gz. Is this right? How can I set it up to BLAST just "nr" rather than "nr.00 nr.01"?
I think your confusion stems from the use of the term "Nucleotide collection nr/nt", on the BLAST page to which you linked.
On that page, "nr/nt" stands for "non-redundant nucleotide." However, as you point out, NCBI also makes separate databases available for download. For those downloads, "nr" is non-redundant protein and "nt" is non-redundant nucleotide.
Yes: you would blastn versus nt and blastp versus nr. No: you cannot BLAST both "at the same time." You need to choose an appropriate combination of BLAST program and database. For example, you can BLAST nucleotide queries against the protein database by using blastx, which first translates the queries in 6 frames.
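To make the program/database pairing concrete, here is a minimal Python sketch that picks the program for a given query and database type and builds the command line without running it. The file names are hypothetical, and tblastn (protein query against a translated nucleotide database) is included for completeness even though it is not discussed above. The `-query`, `-db` and `-out` options are the BLAST+ ones, so this assumes BLAST+ rather than the older blastall.

```python
# Which BLAST+ program fits which (query type, database type) pair.
PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # query is translated in 6 frames
    ("protein", "nucleotide"): "tblastn",  # database is translated in 6 frames
}


def blast_command(query_type, db_type, query_file, db_name, out_file):
    """Return the argument list for a BLAST+ search (it is not executed)."""
    program = PROGRAMS[(query_type, db_type)]
    return [program, "-query", query_file, "-db", db_name, "-out", out_file]


# Hypothetical file names, for illustration only.
print(" ".join(blast_command("nucleotide", "nucleotide", "genes.fna", "nt", "genes_vs_nt.txt")))
print(" ".join(blast_command("protein", "protein", "prots.faa", "nr", "prots_vs_nr.txt")))
print(" ".join(blast_command("nucleotide", "protein", "genes.fna", "nr", "genes_vs_nr.txt")))
```

Each search is a separate run, so covering both nt and nr means running two commands, one per database.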
The 2 files nr.00 and nr.01 simply mean that the database has been split into two parts, because it is very large. Older BLAST versions used an additional index (alias) file; it used to be called "nr.pal" and may still be called that. Provided that 00, 01 and the index file all reside in the same location, local BLAST will "stitch" the 2 parts together in the background and you just specify "nr" as the database. I have not upgraded to BLAST+ myself, so it may be that the index file is no longer required there.
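If you want to check that every part named in the index file (NCBI calls it an alias file) is actually present, something like the sketch below may help. It assumes the file is plain text with a `DBLIST` line naming the parts, quoted or unquoted; please look at your own nr.pal to confirm the layout before relying on it. The function names are mine, not part of BLAST.

```python
import shlex


def volumes_in_alias(alias_text):
    """Return the part names from the DBLIST line of an alias file."""
    for line in alias_text.splitlines():
        line = line.strip()
        if line.startswith("DBLIST"):
            # shlex.split copes with both quoted and unquoted names.
            return shlex.split(line)[1:]
    return []


def missing_volumes(volumes, file_names):
    """List the parts for which no file such as 'nr.00.*' is present."""
    return [
        vol for vol in volumes
        if not any(name.startswith(vol + ".") for name in file_names)
    ]


# Assumed alias file content, for illustration only.
example = 'TITLE example\nDBLIST "nr.00" "nr.01"\n'
print(missing_volumes(volumes_in_alias(example), ["nr.pal", "nr.00.phr"]))
```

On a real directory you would pass the contents of nr.pal and `os.listdir()` of the database folder. The alias file only names the parts; the sequence data lives in the part files themselves.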
So I have the same issue, except the nt database is now in 27 parts. I downloaded all of them but cannot extract any of them because there is absolutely no disk space left. I extracted the nt.00 file first and that had an nt.pal file. Is that all I need?
Am I required to download ALL the nt files? I don't see how this is possible given the space requirements.
Actually, the "nr" database currently has 6 parts, so it should be nr.00 to nr.05.
If you have trouble using the update script, you can also download preformatted BLAST databases from the NCBI FTP server.
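For either route, here is a small Python sketch of what to expect. It lists the archive names for a database split into a given number of parts (following the nr.00.tar.gz pattern mentioned earlier in the thread) and builds an `update_blastdb.pl` command line without running it. As I understand it, the `--decompress` option unpacks each archive after download; check the script's own help output for your version. The function names are made up for this example.

```python
def archive_names(db_name, n_parts):
    """Names of the per-part archives, e.g. nr.00.tar.gz, nr.01.tar.gz."""
    return [f"{db_name}.{i:02d}.tar.gz" for i in range(n_parts)]


def update_command(db_name, decompress=True):
    """Argument list for the update script (built here, not executed)."""
    args = ["update_blastdb.pl"]
    if decompress:
        args.append("--decompress")  # unpack each archive after download
    args.append(db_name)
    return args


print(archive_names("nr", 6))
print(" ".join(update_command("nt")))
```

Comparing the first list against what is in your download folder is a quick way to spot a missing part.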
Thanks --- I was wondering too!