Core_nt database not available to download?
9 weeks ago
DNAngel ▴ 250

Hi all,

Anyone have any luck downloading the new NCBI nt database core_nt?

I had the latest nt database before, and I seriously messed up when I tried to update it yesterday with ./update_blastdb.pl --decompress nt, only to have it drop from 191 nt files to 157. I realized this is because they changed the files around and what I want is core_nt, but ./update_blastdb.pl --decompress core_nt keeps saying:

Warning: No BLASTDB metadata for core_nt
core_nt not found, skipping.

I need the latest database to run my stuff so how do I get this database??

UPDATE!!

As of August 30, 2024, running ./update_blastdb.pl --decompress core_nt works, in case anyone else is interested. I had also downloaded the files manually without problems and uncompressed them all, but I was having issues checking the integrity of the files. The original update command is working now, though!

I am currently re-downloading the files and no issues have been reported so far. I will update later on whether the download was truly successful.

blast ncbi • 742 views
ADD COMMENT
9 weeks ago
GenoMax 146k

Get the files directly from https://ftp.ncbi.nih.gov/blast/db/ . NCBI may need to update the Perl script to allow core_nt downloads through it.

core_nt is a subset of nt and has the following content:

Core_nt contains the same eukaryotic transcript and gene-related sequences as nt. The core_nt database is nt without most eukaryotic chromosome sequences.

ADD COMMENT

Is it just wget https://ftp.ncbi.nih.gov/blast/db/core_nt*.gz ?

Do I need to do anything else manually to make sure the database is downloaded properly and that I can simply refer to it as core_nt? I've only ever used the Perl script before.

ADD REPLY

Get all core_nt* files and that should be all you need. There are currently 65 files, so future visitors should adjust the range accordingly. You may also want to get the .md5 checksum files, if needed.

wget https://ftp.ncbi.nih.gov/blast/db/core_nt.{00..64}.tar.gz

Uncompress them in a folder and it should work as core_nt. Use the blastdbcheck utility included in the BLAST+ package to check integrity.

$ blastdbcheck -db core_nt
Writing messages to <stdout> at verbosity (Summary)
ISAM testing is ENABLED.
Legacy testing is DISABLED.
TaxID testing is DISABLED.
By default, testing 200 randomly sampled OIDs.

Testing 65 volume(s).
 Result=SUCCESS. No errors reported for 65 volume(s).
Testing 1 alias(es).
 Result=SUCCESS. No errors reported for 1 alias(es).
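
For future visitors who also fetch the .md5 files: a minimal sketch of checking each downloaded archive against its checksum before extracting. It assumes each archive has a companion file named <archive>.md5 in the same folder, in the usual "checksum  filename" format, and that md5sum from GNU coreutils is installed. verify_archives is a made-up helper name, not part of BLAST+.

```shell
# Hypothetical helper: check every core_nt archive in a folder against its .md5 file.
# Assumes core_nt.NN.tar.gz and core_nt.NN.tar.gz.md5 sit side by side in "$1".
verify_archives() (
    cd "$1" || exit 1
    status=0
    for md5 in core_nt.*.tar.gz.md5; do
        # If the glob matched nothing, the literal pattern is left behind
        [ -e "$md5" ] || { echo "no .md5 files found in $1" >&2; exit 1; }
        # md5sum -c re-hashes the file named inside the .md5 file and compares
        if ! md5sum -c "$md5" >/dev/null 2>&1; then
            echo "FAILED: ${md5%.md5}" >&2
            status=1
        fi
    done
    exit "$status"
)
```

Something like verify_archives . && echo "all archives OK" would then report whether every archive in the current folder matches its checksum. This only checks the downloads; blastdbcheck is still the tool for checking the extracted database.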
ADD REPLY

I tried to uncompress them as above but it's just giving an error such as:

tar: core_nt.64.tar.gz: Not found in archive

and this is coming up for every file...however, I do have all of these files in my folder.

ADD REPLY

I downloaded one of the files and had no issue uncompressing it. What OS are you working on? What do you see when you do

$ file core_nt.01.tar.gz
core_nt.01.tar.gz: gzip compressed data, last modified: Mon Aug 26 09:09:28 2024, from Unix,
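
One possible cause of the "Not found in archive" message quoted above is passing several archives to a single tar call (for example through a wildcard): tar opens only the archive given to -f and treats the remaining names as members to extract from it. This is a guess at what happened, not something confirmed in the thread. A minimal sketch that unpacks one archive per tar call, with extract_archives as a made-up helper name:

```shell
# Hypothetical helper: unpack every core_nt archive in folder "$1", one tar call each.
extract_archives() (
    cd "$1" || exit 1
    for archive in core_nt.*.tar.gz; do
        # If the glob matched nothing, the literal pattern is left behind
        [ -e "$archive" ] || { echo "no archives found in $1" >&2; exit 1; }
        # -x extract, -z gunzip, -f archive file; exactly one archive per invocation
        tar -xzf "$archive" || exit 1
    done
)
```

Running extract_archives blastdb would unpack everything inside the blastdb folder and stop at the first archive that fails to extract.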
ADD REPLY

It was an error in the way I typed it, but everything is now uncompressed in a blastdb folder. All 64 files are there. However, I'm still getting an error:

cd blastdb 
blastdbcheck -db core_nt

ISAM testing is ENABLED.
Legacy testing is DISABLED.
TaxID testing is DISABLED.
By default, testing 200 randomly sampled OIDs.

[ERROR] could not find all volume or alias files referenced in ./core_nt, [skipped]
Testing 0 volume(s).
 Result=SUCCESS. No errors reported for 0 volume(s).
Testing 0 alias(es).
 Result=SUCCESS. No errors reported for 0 alias(es).

ADD REPLY

My apologies. It looks like there is a file called core_nt.00.tar.gz that the code above originally missed, because the loop started from file 01 instead of 00. Please get it manually (wget https://ftp.ncbi.nih.gov/blast/db/core_nt.00.tar.gz ). I will edit my code above to include that file now.
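
To catch this kind of gap before running blastdbcheck, here is a rough sketch that lists which numbered archives are absent from a folder. It assumes the archives are numbered from 00 with two digits, as in this thread, and that you pass the expected count yourself (65 at the time of writing). missing_archives is a made-up helper name.

```shell
# Hypothetical helper: print the names of core_nt archives missing from a folder.
# $1 = folder, $2 = expected number of volumes (numbered 00 .. count-1)
missing_archives() (
    i=0
    while [ "$i" -lt "$2" ]; do
        # Zero-pad the volume number to two digits: 0 -> 00, 7 -> 07, 64 -> 64
        name=$(printf 'core_nt.%02d.tar.gz' "$i")
        [ -e "$1/$name" ] || echo "$name"
        i=$((i + 1))
    done
)
```

For example, missing_archives . 65 prints nothing when all of core_nt.00.tar.gz through core_nt.64.tar.gz are present; any name it does print can be fetched with wget from https://ftp.ncbi.nih.gov/blast/db/ as in the commands above.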

ADD REPLY
