How to use a downloaded BLAST database
3.1 years ago
c_u ▴ 520

Hi,

I am trying to BLAST some sequences against all the virus and prokaryote genomes available in NCBI. I found out that there are existing BLAST databases for both of them that can be downloaded from the NCBI FTP site (https://ftp.ncbi.nlm.nih.gov/blast/db/). I downloaded the virus BLAST database (ref_viruses_rep_genomes) and expanded the compressed file. The resulting folder looks like a legitimate BLAST database, with .ndb, .nhr, .nin, .nnd, etc. files.

When I try to use this database like

blastn -query /.../C1_animal_spacers_nonredundant.fasta -db /Volumes/bam/DRG/PK/annotations/ref_viruses_rep_genomes -outfmt 6 -out /.../spacerBLASTresults/C1_animal_spacer_BLAST_hits.txt

it gives me the error -

BLAST Database error: No alias or index file found for nucleotide database [/Volumes/bam/DRG/PK/annotations/ref_viruses_rep_genomes] in search path [/.../C3_human::]

What may be going wrong? Do I need to somehow let my machine know that this database exists now? Also, the 'search path' that is part of the error message is different from both my query location and my database location. I am not sure what the reason for that is either.


What I sometimes do to resolve this is set the BLAST_DB environment variable. Don't ask me why, but it may fix this.

So do something like set BLAST_DB = '/Volumes/bam/DRG/PK/annotations/' on the command line or in your script.

(you might need to look up the correct name of that variable :/ )


That is a great suggestion. The variable is actually called BLASTDB, so use: export BLASTDB=/Volumes/bam/DRG/PK/annotations/
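
The BLASTDB suggestion can be sketched as a small shell helper. This is only a sketch under assumptions, not a confirmed fix for the error in this thread: the function name `blastn_via_blastdb` is hypothetical, and it assumes `blastn` is on the PATH. It exports the database directory as BLASTDB and passes only the database base name to -db.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): point BLASTDB at the
# database directory and refer to the database by its base name only.
blastn_via_blastdb() {
    db_path=$1   # database prefix, e.g. /path/to/ref_viruses_rep_genomes
    query=$2     # query FASTA file
    out=$3       # output file for tabular hits

    # Directory that holds the .nin/.nhr/.nsq files
    BLASTDB=$(dirname "$db_path")
    export BLASTDB

    # Quote every path so that spaces cannot split an argument
    blastn -query "$query" -db "$(basename "$db_path")" -outfmt 6 -out "$out"
}
```

It would be called with the database prefix (no file extension), the query file, and the output file, in that order.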


That path specification looks odd. Are you obfuscating the real path with ...? If so, that is OK. Are there any spaces in those paths?


Yes, I was obfuscating the real path with .... I have now added the real paths. I don't think there should be any issue with them


Do you have these files in the database folder?

-rw-r--r-- blastadm/blast 176276 2021-10-11 23:17 ref_viruses_rep_genomes.nin
-rw-r--r-- blastadm/blast 2092879 2021-10-11 23:17 ref_viruses_rep_genomes.nhr
-rw-r--r-- blastadm/blast 114478953 2021-10-11 23:17 ref_viruses_rep_genomes.nsq
-rw-r--r-- blastadm/blast       508 2021-10-11 23:17 ref_viruses_rep_genomes.nni
-rw-r--r-- blastadm/blast    117424 2021-10-11 23:17 ref_viruses_rep_genomes.nnd
-rw-r--r-- blastadm/blast     58744 2021-10-11 23:17 ref_viruses_rep_genomes.nog
-rw-r--r-- blastadm/blast    778240 2021-10-11 23:17 ref_viruses_rep_genomes.ndb
-rw-r--r-- blastadm/blast    293568 2021-10-11 23:17 ref_viruses_rep_genomes.nos
-rw-r--r-- blastadm/blast    176144 2021-10-11 23:17 ref_viruses_rep_genomes.not
-rw-r--r-- blastadm/blast    380928 2021-10-11 23:17 ref_viruses_rep_genomes.ntf
-rw-r--r-- blastadm/blast    104772 2021-10-11 23:17 ref_viruses_rep_genomes.nto
-rw-rw-r-- blastadm/blast 153100208 2021-10-15 09:55 taxdb.btd
-rw-rw-r-- blastadm/blast  16210768 2021-10-15 09:55 taxdb.bti

You can keep the paths obfuscated. That detail is not required here as long as we know that the paths are obfuscated.
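
The file check above can also be scripted. The sketch below is an assumption-laden helper (the name `check_nucl_db` is hypothetical) that only tests whether the index (.nin) or alias (.nal) file, plus the header (.nhr) and sequence (.nsq) files, exist and are readable for a given database prefix. It does not validate their contents.

```shell
#!/bin/sh
# Hypothetical helper: check that a nucleotide database prefix has the
# core files blastn needs. Returns 0 if they are present and readable.
check_nucl_db() {
    prefix=$1   # full path to the database, without a file extension

    # Either an index (.nin) or an alias (.nal) file must exist; these
    # are the files the "No alias or index file found" error refers to.
    if [ ! -r "$prefix.nin" ] && [ ! -r "$prefix.nal" ]; then
        echo "missing or unreadable: $prefix.nin / $prefix.nal" >&2
        return 1
    fi

    # A single-volume database also needs header and sequence files.
    if [ -r "$prefix.nin" ]; then
        for ext in nhr nsq; do
            if [ ! -r "$prefix.$ext" ]; then
                echo "missing or unreadable: $prefix.$ext" >&2
                return 1
            fi
        done
    fi
    return 0
}
```

If this passes and BLAST+ is installed, running `blastdbcmd -db <prefix> -info` asks BLAST itself to open the database, which is a stricter check than file presence alone.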


Yup, every one of them!


I can tell you that the database files at NCBI are complete and they work fine.

Are you using an old version of blastn by chance? The database files available at NCBI are in v5 format and therefore need BLAST+ v. 2.9 (or 2.10) and up. What do you get with

$ blastn -version
blastn: 2.12.0+
 Package: blast 2.12.0, build Jul 19 2021 09:26:29
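
The version check can be automated with a numeric comparison, since a plain string comparison would wrongly rank 2.9 above 2.10. This is a minimal sketch: the helper names `version_ge` and `blastn_version` are hypothetical, and the 2.10.0 threshold in the usage note is taken from the statement above rather than from the BLAST documentation.

```shell
#!/bin/sh
# Hypothetical helper: succeed if dotted version $1 is >= dotted version $2.
# A trailing "+" (as in "2.10.0+") is stripped before comparing.
version_ge() {
    awk -v a="${1%+}" -v b="${2%+}" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        # Compare field by field as numbers; missing fields count as 0
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Hypothetical helper: print the version field from the first line of
# `blastn -version`, which has the form "blastn: 2.10.0+".
blastn_version() {
    blastn -version | awk 'NR == 1 { print $2 }'
}
```

For example, `version_ge "$(blastn_version)" 2.10.0 || echo "blastn may be too old for v5 databases"` prints a warning only when the installed version is below the threshold.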

$ blastn -version
blastn: 2.10.0+
 Package: blast 2.10.0, build Jan 8 2020 22:00:44


This should work without issues. Do you get any other messages besides the error above?


No, this is the only issue I get. Something I just noticed is that when it says No alias or index file found for nucleotide database [/Volumes/bam/DRG/PK/annotations/ref_viruses_rep_genomes] in search path XYZ, the XYZ is the current working directory in the terminal. I am not sure if it's relevant, but the error continues to show up!
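
Since the reported search path is the current working directory, one diagnostic worth trying is to run blastn from inside the database directory and pass only the database base name. The sketch below is untested against this particular setup and is not a confirmed fix: the function name `blastn_from_db_dir` is hypothetical, and it assumes `blastn` is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: run blastn from inside the database directory so
# that the bare database name can resolve via the working directory.
# The query and output paths should be absolute, because the working
# directory changes inside the subshell.
blastn_from_db_dir() {
    db_path=$1   # database prefix, without a file extension
    query=$2     # absolute path to the query FASTA file
    out=$3       # absolute path for the tabular output
    (
        # The subshell keeps the caller's working directory unchanged
        cd "$(dirname "$db_path")" || exit 1
        blastn -query "$query" -db "$(basename "$db_path")" -outfmt 6 -out "$out"
    )
}
```

If this succeeds where the full-path form fails, that would point toward how the path is being resolved rather than toward the database files themselves.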
