Hi all, I'm trying to use the nr database from BLAST and add some taxonomic information to my blat output.
So, I downloaded and uncompressed:
-taxcat.zip
-taxdump.tar.gz
-prot.accession2taxid.gz
-the nr database (huge file)
And I put all of them into one folder named blast_database.
Then I set my BLASTDB path as:
export BLASTDB=/pandata/me/LEPIWASP/blast_database
and when I try to generate the gi_to_des.tab database by running: blastdbcmd -entry 'all' -db nr > nr.faa
I get:
BLAST Database error: No alias or index file found for nucleotide database [nr]
Does someone have an idea where my mistake is?
The nr file is, however, in the blast_database directory.
I do not understand.
Here are the files inside my directory:
total 106892000
-rw-r--r-- 1 16783992 May 11 12:20 citations.dmp
-rw-r--r-- 1 3568599 May 11 12:20 delnodes.dmp
-rw-r--r-- 1 442 May 11 12:20 division.dmp
-rw-r--r-- 1 15188 May 11 12:20 gc.prt
-rw-r--r-- 1 4575 May 11 12:20 gencode.dmp
-rw-r--r-- 1 919089 May 11 12:20 merged.dmp
-rw-r--r-- 1 154534803 May 11 12:20 names.dmp
-rw-r--r-- 1 119658024 May 11 12:20 nodes.dmp
-rw-r--r-- 1 93133265049 May 10 19:38 nr
-rw-r--r-- 1 0 May 11 13:31 nr.faa
-rw-r--r-- 1 3766079372 May 11 13:31 prot.accession2taxid.gz
-rw-r--r-- 1 58 May 11 13:10 prot.accession2taxid.gz.md5
-rw-r----- 1 2652 Jun 13 2006 readme.txt
-rw-r--r-- 1 6766010 May 11 13:08 taxcat.zip
-rw-r--r-- 1 43086159 May 11 13:09 taxdump.tar.gz
Output requested, for grep "^>" nr | head -3:
>WP_003131952.1 30S ribosomal protein S18 [Lactococcus lactis]NP_268346.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis Il1403]Q9CDN0.1 RecName: Full=30S ribosomal protein S18Q02VU1.1 RecName: Full=30S ribosomal protein S18A2RNZ2.1 RecName: Full=30S ribosomal protein S18AAK06287.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis Il1403]ABJ73931.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris SK11]CAL99037.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris MG1363]ADA65983.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis KF147]ADJ61439.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris NZ9000]ADZ64834.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis CV56]EHE92602.1 hypothetical protein LLCRE1631_01913 [Lactococcus lactis subsp. lactis CNCM I-1631]AEU41715.1 SSU ribosomal protein S18p [Lactococcus lactis subsp. cremoris A76]BAL52156.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis IO-1]AFW92578.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris UC509.9]CDG05746.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis A12]EQC53187.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis str. TIFN4]EQC53393.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis str. TIFN2]EQC54683.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN6]EQC56744.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN5]EQC82878.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN7]EQC91162.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN1]EQC94448.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris TIFN3]AGV74185.1 ribosomal protein S18 RpsR [Lactococcus lactis subsp. cremoris KW2]AGY45032.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis KLDS 4.0325]ESK79551.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. 
diacetylactis str. LD61]KEY61992.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris GE214]AII13743.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis NCDO 2118]KGF77556.1 SSU ribosomal protein S18p SSU ribosomal protein S18p, zinc-independent [Lactococcus lactis]AIS04718.1 SSU ribosomal protein S18P [Lactococcus lactis]KGH32949.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris]KHE77803.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis 1AA59]KKW69436.1 ribosomal protein bS18, rpsR [Lactococcus lactis subsp. cremoris]KKW70341.1 ribosomal protein bS18, rpsR [Lactococcus lactis subsp. cremoris]KLK95226.1 ribosomal protein bS18, rpsR [Lactococcus lactis subsp. lactis]KRO21588.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]KST41693.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis]KST76534.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST79241.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST81638.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST85642.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST88531.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST92921.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST97154.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST98471.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KST99285.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. 
lactis]KSU03686.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU05991.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU09388.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU13881.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU20925.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU23615.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU25349.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU27070.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU28321.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KSU32404.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis]KZK07251.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK08880.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. lactis bv. diacetylactis]KZK09361.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK33282.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK44117.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK46962.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]KZK52810.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. 
cremoris]KZK53814.1 SSU ribosomal protein S18p SSU ribosomal protein S18p zinc-independent [Lactococcus lactis subsp. cremoris]OAJ97698.1 30S ribosomal protein S18 [Lactococcus lactis]OAZ16676.1 30S ribosomal protein S18 [Lactococcus lactis RTB018]SBW31684.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]OEU38668.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris IBB477]OJH46247.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis]ONK31551.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]ARD92294.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]ARD97280.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARD99957.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE04690.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE06709.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]ARE09571.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE12078.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE14468.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE16888.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE19344.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]ARE21948.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. lactis]ARE24261.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]ARE27001.1 SSU ribosomal protein S18P [Lactococcus lactis subsp. cremoris]OSP86582.1 30S ribosomal protein S18 [Lactococcus lactis]ARR87601.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis bv. diacetylactis]PAK66121.1 30S ribosomal protein S18 [Lactococcus lactis]PAK87984.1 30S ribosomal protein S18 [Lactococcus lactis]PAL02283.1 30S ribosomal protein S18 [Lactococcus lactis]PCS13431.1 30S ribosomal protein S18 [Lactococcus lactis subsp. hordniae]PCS17241.1 30S ribosomal protein S18 [Lactococcus lactis subsp. 
tructae]PEN18002.1 30S ribosomal protein S18 [Lactococcus lactis]PFG75654.1 30S ribosomal protein S18 [Lactococcus lactis]PFG79860.1 30S ribosomal protein S18 [Lactococcus lactis]PFG84386.1 30S ribosomal protein S18 [Lactococcus lactis]PFG87566.1 30S ribosomal protein S18 [Lactococcus lactis]PFG90835.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris]PFG90892.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]ATY88684.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]ATZ02303.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]PLW60021.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]AUS70574.1 30S ribosomal protein S18 [Lactococcus lactis subsp. lactis]PPA66113.1 30S ribosomal protein S18 [Lactococcus lactis]BBC75095.1 30S ribosomal protein S18 [Lactococcus lactis subsp. cremoris]
>XP_642131.1 hypothetical protein DDB_G0277827 [Dictyostelium discoideum AX4]P54670.1 RecName: Full=Calfumirin-1; Short=CAF-1BAA06266.1 calfumirin-1 [Dictyostelium discoideum AX2]EAL68086.1 hypothetical protein DDB_G0277827 [Dictyostelium discoideum AX4]
>XP_642837.1 hypothetical protein DDB_G0276911 [Dictyostelium discoideum AX4]EAL68957.1 hypothetical protein DDB_G0276911 [Dictyostelium discoideum AX4]
I downloaded the nr file from :
Is it the right one?
That is not the blast index. It is the fasta format sequences file for nr. If you need that then there is no need to use blastdbcmd.
In fact here is the tutorial for what I need to do :
So, I do not need to do all these things if I downloaded the huge nr file?
I would think so.
That said, if you want to follow the tutorial exactly then you should download all nr.tar.gz files from db directory (ftp://ftp.ncbi.nih.gov/blast/db).
Can you post the output of grep "^>" nr | head -3? I want to compare what the headers look like in that fasta file with nr blast index.
Yep sure, I wrote it in my first comment
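Since each nr header concatenates many identical-sequence entries, the raw grep output is hard to compare by eye. Here is a minimal sketch, assuming that only the first accession on each header line is needed for the comparison. The function name first_headers is hypothetical and not part of BLAST+.

```shell
# Hypothetical helper: print the first accession of the first N FASTA headers.
# Usage: first_headers FILE [N]   (N defaults to 3)
first_headers() {
    file=$1
    n=${2:-3}
    # On a header line, $1 is the ">ACCESSION" token; strip the leading ">"
    # and stop reading the file as soon as N headers have been printed.
    awk -v max="$n" '/^>/ { print substr($1, 2); if (++seen >= max) exit }' "$file"
}
```

Running first_headers nr 3 in the directory holding the fasta file would list three accessions, one per line, and it exits early rather than scanning the whole file.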
That looks identical to what I get from blastdbcmd -entry 'all' -db nr. So you should be good to go to next step in your workflow. May want to rename nr to nr.faa if that file name is expected.
If whatever you are trying to do needs the nr blast indexes then you would need to download them from the link in one of the comments above.
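To tell the two cases apart before running blastdbcmd, a rough check like the one below may help. It is only a sketch: it assumes the usual protein database naming (an alias file nr.pal, or index files such as nr.pin or nr.00.pin), and check_blast_db is a hypothetical helper name, not a BLAST+ command.

```shell
# Hypothetical helper: report whether DIR holds a formatted protein BLAST
# database called NAME, or only a plain FASTA file of that name.
# Usage: check_blast_db DIR NAME
check_blast_db() {
    dir=$1
    name=$2
    # Assumption: a formatted protein database has an alias file (NAME.pal)
    # or index files such as NAME.pin / NAME.00.pin.
    for f in "$dir/$name.pal" "$dir/$name.pin" "$dir/$name".[0-9]*.pin; do
        if [ -e "$f" ]; then
            echo "indexed"
            return 0
        fi
    done
    # A plain FASTA file starts with ">" on its first line.
    if [ -f "$dir/$name" ] && [ "$(head -c 1 "$dir/$name")" = ">" ]; then
        echo "fasta-only"
        return 0
    fi
    echo "missing"
    return 1
}
```

For the directory listing in the question, check_blast_db "$BLASTDB" nr should print fasta-only, which is consistent with the "No alias or index file found" error.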