Question

blastn_vdb "- Error opening the following db(s): "

7.8 years ago

musta1234 ▴ 30

Background:

I used blastn_vdb, part of the SRA Toolkit, to search a set of genomes in the NCBI WGS database, following the instructions at: ftp://ftp.ncbi.nlm.nih.gov/blast/WGS_TOOLS/README_BLASTWGS.txt. I ran blastn_vdb with a nucleotide FASTA query (4 sequences, ~1 kb each) against all Gammaproteobacteria WGS sequences (24,507 WGS genomes). blastn_vdb downloaded 24,040 genomes (XXX00N.cache) and 439 other XXX00N files without the .cache extension before exiting with the error messages below. I have more than enough disk space available for the WGS and BLAST output folders.

Commands:

taxid2wgs.pl -title "Gammaprot" -alias_file Gammaprot 1236
blastn_vdb -query ../queries/queries.fna -db Gammaprot -outfmt '6 std qlen slen' -out Gammaprot_v_queries.txt

Error message:

NCBI C++ Exception:
T0 "/export/home/tomcat/TeamCity/Agent3/work/9b0ff710cbae2b8c/blastn/c++/src/internal/blast/vdb/vdb2blast_util.cpp", line 266: Error: ncbi::CVDBBlastUtil::x_MakeSRASeqSrc() - Error opening the following db(s): MPCP01

This error is reported for hundreds of WGS IDs. I checked a few of them and found intact sequences on the NCBI web interface.

Any help or hint is appreciated.

sra sra-toolkit blast blastn_vdb wgs • 3.0k views

ADD COMMENT • link updated 5 months ago by cduvallet • 0 • written 7.8 years ago by musta1234 ▴ 30

Hi @musta1234, did you find a way to get this to work?

ADD REPLY • link 2.6 years ago by bioinfo17 ▴ 30

score 0 · Answer 1 · 2024-06-10

I encountered a similar error, emailed NCBI, and got the following response:

WGS datasets are numerous, they are stored within the same system as the SRA datasets with even larger number of volumes. Some lookup and retrieval for blast need will fail, which often is a reflection of the server status for that SRA retrieval system, there is nothing wrong with the db volume reported back. It is unfortunate that this will break the blast search. You may want to consider break up the alias into smaller subsets manually and search each separately.

So it seems that retrieval is expected to fail sometimes, even for db IDs that do exist. Maybe there is a workaround, or maybe this behavior will be updated soon, but in case it isn't, I hope this is helpful for future folks!
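
For anyone who wants to try NCBI's suggestion of splitting the alias into smaller subsets, here is a rough Python sketch of how that might look. It is not something NCBI provided. It assumes you already have the WGS project IDs as a plain list (for example, one ID per line in a text file you made yourself), and that blastn_vdb accepts a space-separated list of IDs in -db the way BLAST+ programs do. The function names, the batch size of 500, and the output file naming are all made up for illustration. The -outfmt value is copied from the command in the question.

```python
import subprocess


def chunked(ids, size):
    """Split a list of WGS project IDs into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def build_command(batch, query, out_path):
    """Build one blastn_vdb call as an argument list for a single batch."""
    return [
        "blastn_vdb",
        "-query", query,
        # Assumption: -db takes a space-separated list of IDs, as in BLAST+.
        "-db", " ".join(batch),
        "-outfmt", "6 std qlen slen",
        "-out", out_path,
    ]


def run_batches(ids, query, out_prefix, size=500):
    """Search each batch separately and return the batches that failed."""
    failed = []
    for n, batch in enumerate(chunked(ids, size)):
        cmd = build_command(batch, query, f"{out_prefix}.{n:04d}.txt")
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Keep the batch so it can be retried later, e.g. with a smaller size.
            failed.append(batch)
    return failed


# Hypothetical usage, with an ID list file you prepared yourself:
#   ids = open("Gammaprot_ids.txt").read().split()
#   failed = run_batches(ids, "../queries/queries.fna", "Gammaprot_v_queries")
```

Since the failures apparently reflect server status rather than a problem with the volume itself, rerunning only the failed batches later may be enough. The per-batch tabular outputs can then simply be concatenated.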