18 months ago
Francois Piumi
Hi, I'd like to run kraken against the nt database, since my fastq sequences are not human.
I built a kraken nt database as follows:
kraken2-build --download-taxonomy --db nt
The next step was to run kraken on my fastq file:
kraken2 --db nt --output ERR637906.output.txt --report ERR637906.report.txt ERR637906.fastq.gz
Here is the output error message:
kraken2: database ("./nt") does not contain necessary file taxo.k2d
Could you please explain how to build a proper kraken "nt" database?
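The error above points at a directory that does not yet hold a finished database. As a rough sketch, assuming a built Kraken 2 database directory contains the three files hash.k2d, opts.k2d and taxo.k2d, a small check like the following reports which ones are missing. The function name check_k2_db is made up for illustration and is not part of kraken.

```shell
# Hypothetical helper: report which .k2d files are missing from a database directory.
# Assumes a built Kraken 2 database holds hash.k2d, opts.k2d and taxo.k2d.
check_k2_db() {
    db_dir="$1"
    missing=0
    for f in hash.k2d opts.k2d taxo.k2d; do
        if [ ! -f "$db_dir/$f" ]; then
            echo "missing: $db_dir/$f"
            missing=1
        fi
    done
    # Exit status 0 only when all three files are present
    return "$missing"
}
```

For example, `check_k2_db nt || echo "database is not built yet"` would flag a directory that only went through the taxonomy download.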
Have you built the kraken nt database? What you show is just step 1 (building taxonomy) in custom library build (see: https://github.com/DerrickWood/kraken2/wiki/Manual#custom-databases ). Step 2 is where you download the nt db.
Here are the commands I found to download the nt database:
wget http://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nt.gz
pv nt.gz | zcat | cat -v | sed 's/\^A/\t/g' > nt.fasta
kraken2-build --download-taxonomy --db nt
The "pv" command was not working, so I used:
zcat nt.gz | cat -v | sed 's/\^A/\t/g' > nt.fasta
There were also two other command lines that I didn't use:
kraken2-build --add-to-library ./nt.fasta --db nt
kraken2-build --build --threads 6 --db nt
Looks like you did not build the actual database. $DBNAME is whatever you want to call your database.
That said, save yourself the trouble and download the prebuilt kraken nt database indexes from here: https://benlangmead.github.io/aws-indexes/k2
As you can see, they are large (~480 GB); building them locally would be a chore and would need a lot of resources.
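For anyone who still wants to build locally, here is a minimal sketch of the order in which the three kraken2-build calls quoted in this thread would run. It assumes nt.fasta has already been prepared as described above. The `run` wrapper, the `build_nt_db` function and the `DRY_RUN` variable are illustrative names, not kraken features, and the commands are copied from the thread rather than verified against a real nt build.

```shell
# Sketch only: DBNAME and THREADS are illustrative defaults.
DBNAME="${DBNAME:-nt}"
THREADS="${THREADS:-6}"

# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The three steps from the thread, in order; stops at the first failure.
build_nt_db() {
    run kraken2-build --download-taxonomy --db "$DBNAME" &&
    run kraken2-build --add-to-library ./nt.fasta --db "$DBNAME" &&
    run kraken2-build --build --threads "$THREADS" --db "$DBNAME"
}
```

Setting `DRY_RUN=1` before calling `build_nt_db` prints the three commands without touching the disk, which is a cheap way to confirm the database name and thread count before starting a long build.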
The first command is not working, since I got the following error message:
gzip: nt: Disk quota exceeded
So it seems impossible for me to run kraken on nt.
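A quick check before starting a large download can save a failed run. The sketch below is an assumption-laden example, not part of kraken: the helper name has_free_gb is hypothetical, and `df` reports free space on the filesystem, which may be larger than a per-user quota (on systems that have it, the `quota` command reports the quota itself).

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least NEED_GB free.
# Note: df shows filesystem free space, which can exceed a per-user quota.
has_free_gb() {
    dir="$1"
    need_gb="$2"
    # -P forces the portable one-line-per-filesystem format, -k reports 1024-byte blocks
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    need_kb=$((need_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$need_kb" ]
}

# Example: the prebuilt nt index mentioned in this thread is roughly 480 GB
if has_free_gb . 480; then
    echo "enough free space for ~480 GB"
else
    echo "not enough free space for ~480 GB"
fi
```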
I am trying to add genomes to kraken manually, but I am also running into some difficulties:
Add a genome to a kraken library