Hi,
I am trying to BLAST some sequences against all the virus and prokaryote genomes available in NCBI. I found out that there are existing BLAST databases for both of them that can be downloaded from the NCBI FTP site (https://ftp.ncbi.nlm.nih.gov/blast/db/). I downloaded the virus BLAST database (ref_viruses_rep_genomes) and expanded the compressed file. The resulting folder looks like a legitimate BLAST database, with .ndb, .nhr, .nin, .nnd etc. files.
When I try to use this database like
blastn -query /.../C1_animal_spacers_nonredundant.fasta -db /Volumes/bam/DRG/PK/annotations/ref_viruses_rep_genomes -outfmt 6 -out /.../spacerBLASTresults/C1_animal_spacer_BLAST_hits.txt
it gives me the error -
BLAST Database error: No alias or index file found for nucleotide database [/Volumes/bam/DRG/PK/annotations/ref_viruses_rep_genomes] in search path [/.../C3_human::]
What may be going wrong? Do I need to somehow let my machine know that this database exists now? Also, the 'search path' that is part of the error message is different from both my query location and my database location. Not sure what the reason for that is either.
What I sometimes do to resolve this is set the BLAST_DB env variable. Don't ask me why, but it's possible it fixes this.
So do something like
set BLAST_DB = '/Volumes/bam/DRG/PK/annotations/'
on the command line or in your script (you might need to look up the correct name of that variable :/ ).
That is a great suggestion. The variable is actually called BLASTDB, so:
export BLASTDB=/Volumes/bam/DRG/PK/annotations/
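To see why setting BLASTDB could matter, here is a rough sketch of the lookup that the error message seems to describe: the current working directory first, then each directory listed in BLASTDB (empty in the error above). The helper name find_blast_db is made up for illustration and is not part of BLAST+. It only checks for a nucleotide alias (.nal) or index (.nin) file, and it expects a bare database name rather than a full path.

```shell
#!/bin/sh
# Sketch only: find_blast_db is a hypothetical helper, not a BLAST+ command.
# It takes a bare database name (no directory, no extension) and looks for
# <name>.nal or <name>.nin in "." and then in each BLASTDB directory.
find_blast_db() {
    name=$1
    old_ifs=$IFS
    IFS=:
    # Unquoted on purpose so that BLASTDB is split on ':'.
    set -- . ${BLASTDB:-}
    IFS=$old_ifs
    for dir in "$@"; do
        # Skip empty entries such as the ones in "path::".
        [ -n "$dir" ] || continue
        for ext in nal nin; do
            if [ -f "$dir/$name.$ext" ]; then
                printf '%s\n' "$dir/$name.$ext"
                return 0
            fi
        done
    done
    return 1
}
```

After the export above, calling find_blast_db ref_viruses_rep_genomes should print the file it found, and blastn can then be given the bare name with -db ref_viruses_rep_genomes instead of the full path. If the helper prints nothing, the files are probably not directly inside the BLASTDB directory.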
That path specification looks odd (are you obfuscating the real path with ...? If so, that is OK). Are there any spaces in those paths?
Yes, I was obfuscating the real path with .... I have now added the real paths. I don't think there should be any issue with them.
Do you have these files in the database folder?
You can keep the paths obfuscated. That detail is not required here as long as we know that the paths are obfuscated.
Yup, every one of them!
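One thing worth ruling out (the thread does not confirm this is the cause): the -db argument is a file prefix, i.e. the path to the database files without their extension, not the folder that contains them. If expanding the archive created a folder named ref_viruses_rep_genomes, the prefix would need the name twice. A small sketch of that check follows; check_blast_prefix is a hypothetical helper, not part of BLAST+.

```shell
#!/bin/sh
# Sketch only: check_blast_prefix is a hypothetical helper, not a BLAST+ command.
# It reports whether a -db value points at database files (a prefix) or at a
# folder that merely contains them.
check_blast_prefix() {
    prefix=${1%/}                      # ignore one trailing slash
    base=$(basename "$prefix")
    if [ -f "$prefix.nal" ] || [ -f "$prefix.nin" ]; then
        echo "ok: $prefix"
    elif [ -d "$prefix" ] && { [ -f "$prefix/$base.nal" ] || [ -f "$prefix/$base.nin" ]; }; then
        # The files live one level down, so the prefix must include the folder.
        echo "folder, try: $prefix/$base"
    else
        echo "missing: $prefix.nal or $prefix.nin"
        return 1
    fi
}
```

Running check_blast_prefix with the same value passed to -db shows which case applies. If it prints "folder, try: ...", passing that longer path to -db is the thing to test.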
I can tell you that the database files at NCBI are complete and they work fine.
Are you using an old version of blastn by chance? Database files available at NCBI are in v5 format and need blast+ v2.9 (or 2.10) and up. What do you get with the following?
| => blastn -version
blastn: 2.10.0+
Package: blast 2.10.0, build Jan 8 2020 22:00:44
This should work without issues. Do you get any other messages besides the error above?
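If you want to script that version check, something like the following could work. It is only a sketch: parse_blast_version and version_at_least are made-up helper names, and the 2.9.0 threshold is simply the one mentioned in this thread.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of BLAST+.

# Read `blastn -version` output on stdin and print the version, e.g. "2.10.0+".
parse_blast_version() {
    awk 'NR == 1 { print $2 }'
}

# Succeed if version $1 is >= version $2, comparing dot-separated numbers.
# A trailing "+" (as printed by BLAST+) is dropped before comparing.
version_at_least() {
    printf '%s %s\n' "${1%+}" "${2%+}" | awk '{
        n = split($1, a, "."); m = split($2, b, ".")
        for (i = 1; i <= (n > m ? n : m); i++) {
            if (a[i] + 0 > b[i] + 0) exit 0
            if (a[i] + 0 < b[i] + 0) exit 1
        }
        exit 0
    }'
}
```

With BLAST+ installed, version_at_least "$(blastn -version | parse_blast_version)" 2.9.0 should succeed for the 2.10.0+ build shown above, since the fields are compared numerically (10 is greater than 9) rather than as strings.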
No, this is the only issue I get. Something I just noticed is that when it says
No alias or index file found for nucleotide database [/Volumes/bam/DRG/PK/annotations/ref_viruses_rep_genomes] in search path XYZ
the XYZ is the current working directory in the terminal. Not sure if it's relevant, but the error continues to show up!