Kraken Database problem
7.3 years ago

Hi everybody, After downloading Kraken's database with the "kraken-build" command, I typed the command:

kraken --db MY_DB MYSEQ.fasta (where MYSEQ.fasta is a single sequence that I downloaded from NCBI, and MY_DB is the downloaded database)

But I received this error: MY_DB/database.kdb does not exist!

As I read in Kraken's manual, there should be four parts in every database:

database.kdb: Contains the k-mer to taxon mappings
database.idx: Contains minimizer offset locations in database.kdb
taxonomy/nodes.dmp: Taxonomy tree structure + ranks
taxonomy/names.dmp: Taxonomy names

But in the downloaded DB I don't have all of them. All I have is a library folder with about 2000 subfolders containing nodes.dmp and names.dmp

And a taxonomy folder containing names.dmp and nodes.dmp
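
The file list quoted from the manual can be checked mechanically. The following is a minimal sketch, not part of Kraken itself: the function name `missing_kraken_db_files` is hypothetical, and the four file names are simply the ones listed above.

```python
from pathlib import Path

# File names copied from the manual's list quoted above.
REQUIRED_FILES = (
    "database.kdb",
    "database.idx",
    "taxonomy/nodes.dmp",
    "taxonomy/names.dmp",
)


def missing_kraken_db_files(db_dir):
    """Return the required database files that are absent from db_dir.

    Hypothetical helper for illustration; it only looks at the file
    system and does not call Kraken.
    """
    root = Path(db_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_kraken_db_files("MY_DB")` on a folder that holds only the library and taxonomy directories would report database.kdb and database.idx as missing, which suggests the build step never finished and matches the "database.kdb does not exist" error.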

Can anybody help me, please?


Are you using Kraken's pre-built minidb databases? It is not clear what MY_DB represents here. If you are trying to build the standard Kraken database using the commands provided in the manual, it might not work because NCBI's FTP structure has changed since the Kraken manual was released. I've written a small Python script that can build the Kraken DB: Update kraken databases


Hi, MY_DB is an empty folder that stores the results of the "kraken-build" command, which downloads the standard DB.

Do you mean that I should paste the lines of your code into a file with the ".sh" extension, as a bash file, and then call this file instead of "kraken-build"? Am I right? Can you explain what I should do after installing Jellyfish?

In other words, after downloading Kraken and Jellyfish 1.1.11 from GitHub, I followed these steps:

1. I installed Kraken with the command "install_kraken.sh".
2. I installed Jellyfish with the command "Jellyfish --prefix $myaddress" and did its configuration.
3. I downloaded the standard database with the command "kraken-build --standard --jellyfish-hash-size 6400M --db $MY_DB", which took about 9 hours!
4. I downloaded one sequence from NCBI and saved it in a file named "seq.fasta".
5. I typed "kraken --db $MY_DB seq.fasta", and at this step I received the error mentioned above.

Can you please clarify at which step I should use your bash file instead? Thank you very much!


Excuse me, I am still waiting for your help. Can you guide me, please? Thank you


Apologies, but it is not clear to me what you are trying to do. Could you please provide a link to the tutorial that you followed and the main aim of what you are trying to achieve?


Thank you. I followed the steps at this address: http://ccb.jhu.edu/software/kraken/MANUAL.html . After downloading the database from NCBI and saving it in a folder named MY_DB, I typed the command "kraken --db MY_DB MYSEQ.fasta" (where MYSEQ.fasta is a single sequence that I downloaded from NCBI). With this command I wanted to find out whether the single sequence in MYSEQ.fasta exists in MY_DB or not.

But at this step I received the error "MY_DB/database.kdb does not exist!".

Now I would like your help: at which step should I use your script instead? (I saved the code in a file named krak.sh. For example, should I run "krak.sh MY_DB" instead of "kraken-build --standard --db MY_DB"?) Can you please clarify?

I appreciate your help again.


"I followed the steps at this address http://ccb.jhu.edu/software/kraken/MANUAL.html . After downloading the database from NCBI,"

I do not think the standard database download described on the Kraken webpage works anymore.

When I run the standard database build command provided, kraken-build --standard --db MY_DB, I get the following error:

Downloaded taxonomy tree data
Uncompressed GI to taxon map
Uncompressed taxonomy tree data
--2017-09-04 15:29:14--  ftp://ftp.ncbi.nih.gov/genomes/Bacteria/all.fna.tar.gz
           => ‘all.fna.tar.gz’
Resolving ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)... 130.14.250.12, 2607:f220:41e:250::12
Connecting to ftp.ncbi.nih.gov (ftp.ncbi.nih.gov)|130.14.250.12|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /genomes/Bacteria ...
No such directory ‘genomes/Bacteria’.

As mentioned at http://ccb.jhu.edu/software/kraken/MANUAL.html in the "Standard Kraken Database" section:

WARNING: we created the scripts to build the standard database way back in 2014. It relies on the NCBI database to automatically download all the genomes and taxonomy IDs needed. However, NCBI's files have changed and we don't have the resources to keep this script up to date, so it might either (a) fail or (b) succeed but get a set of genomes that is incomplete. Therefore we STRONGLY recommend that you build a custom database with all the genomes you need for your application. You'll be glad you did!
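
For the custom database that the warning recommends, the Kraken manual describes three kraken-build stages: download the taxonomy, add sequences to the library, then build. The sketch below only assembles those command lines in order. It is an illustration and not the script from the linked post; the function names `custom_db_commands` and `run_all` and the example FASTA file names are assumptions, and the manual's requirements for FASTA sequence IDs should be checked before adding files.

```python
import subprocess


def custom_db_commands(db_dir, fasta_files, threads=1):
    """Return the kraken-build invocations for a custom database, in order.

    Hypothetical helper: it only builds argument lists and runs nothing.
    """
    # Stage 1: fetch the NCBI taxonomy into db_dir/taxonomy.
    commands = [["kraken-build", "--download-taxonomy", "--db", db_dir]]
    # Stage 2: add each FASTA file to the database library.
    for fasta in fasta_files:
        commands.append(
            ["kraken-build", "--add-to-library", fasta, "--db", db_dir]
        )
    # Stage 3: build the database files (database.kdb, database.idx).
    commands.append(
        ["kraken-build", "--build", "--threads", str(threads), "--db", db_dir]
    )
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing stage.
            subprocess.run(cmd, check=True)
```

With the default `dry_run=True`, `run_all(custom_db_commands("MY_DB", ["genome1.fna", "genome2.fna"], threads=4))` only prints the commands, so they can be reviewed before anything is downloaded or built.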

And how can I do this? I mean, how can I use your code? Thank you


Please follow this post: Update kraken databases


I've read this before. As I said, I copied and pasted your code into a file and named it "krak.sh". Now I want to know exactly what I should do, step by step. Can you help me, please? Thank you
