KRAKEN2 database build with NCBI nucleotide

3.8 years ago

jimmy0958073736 ▴ 40

Hi everyone, I am trying to download NCBI nucleotide records and convert them into a Kraken2 database, but I have run into a problem. Is there a tool, similar to SRA-toolkit (where you enter an SRA number and the data downloads automatically), that does this for nucleotide records? A Kraken2 database needs a taxonomy ID for each sequence, and the sequences in FASTA format, in order to build. Does anyone know how to download the nucleotide records of a single BioProject and then convert them into Kraken2 format?

KRAKEN2 NCBI nucleotide SRA-toolkit • 6.0k views

ADD COMMENT • link updated 3.4 years ago by predeus ★ 2.1k • written 3.8 years ago by jimmy0958073736 ▴ 40

Please see: How do you download the nt database for Kraken2?

ADD REPLY • link 3.8 years ago by GenoMax 147k

That is a good suggestion, but my situation is that I want to convert BioProject data into a Kraken2 build. My current way of processing the data is simply to download the tiny FASTA XML and convert it.
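
For illustration, here is a minimal Python sketch of that conversion step. It assumes the "tiny FASTA XML" is NCBI's TinySeq XML, with `TSeq_accver`, `TSeq_taxid`, `TSeq_defline` and `TSeq_sequence` elements, and it writes headers in the `kraken:taxid|XXX` form that Kraken2 accepts for sequences whose taxonomy is assigned explicitly. The function name `tinyseq_to_kraken_fasta` and the `width` parameter are made up for this example.

```python
import xml.etree.ElementTree as ET


def tinyseq_to_kraken_fasta(xml_text, width=60):
    """Convert TinySeq XML text into FASTA text with Kraken2 taxid headers."""
    root = ET.fromstring(xml_text)
    records = []
    # iter() also matches the root, so a lone <TSeq> works as well as a <TSeqSet>.
    for tseq in root.iter("TSeq"):
        accver = tseq.findtext("TSeq_accver")
        taxid = tseq.findtext("TSeq_taxid")
        defline = tseq.findtext("TSeq_defline", default="")
        seq = tseq.findtext("TSeq_sequence", default="")
        if not accver or not taxid or not seq:
            # Without an ID, a taxid and a sequence the record is unusable here.
            continue
        header = f">{accver}|kraken:taxid|{taxid} {defline}".rstrip()
        # Wrap the sequence at a fixed line width.
        lines = [seq[i:i + width] for i in range(0, len(seq), width)]
        records.append("\n".join([header] + lines))
    return "\n".join(records) + ("\n" if records else "")
```

The resulting text can be saved to a `.fa` file and added to a Kraken2 library. Records without a taxid are skipped rather than guessed.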

ADD REPLY • link 3.8 years ago by jimmy0958073736 ▴ 40

You can download pre-built Kraken2 databases directly from their FTP site: ftp://ftp.ccb.jhu.edu/pub/data/kraken2_dbs/. For custom databases, follow this tutorial: https://ccb.jhu.edu/software/kraken/MANUAL.html#custom-databases
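
As a rough sketch of what a custom build looks like with Kraken2, the Python snippet below only assembles and prints the `kraken2-build` commands (download the taxonomy, add each FASTA file to the library, build). It does not run them. The database name `my_db`, the file name `bioproject.kraken.fa`, and the helper `kraken2_custom_build_commands` are placeholders invented for this example, and it assumes `kraken2-build` is on your PATH when you run the printed commands.

```python
import shlex


def kraken2_custom_build_commands(db_dir, fasta_files, threads=4):
    """Return the kraken2-build command lines for a custom database."""
    # Step 1: fetch the NCBI taxonomy into the database directory.
    cmds = [["kraken2-build", "--download-taxonomy", "--db", db_dir]]
    # Step 2: add each FASTA file to the database library.
    for fasta in fasta_files:
        cmds.append(["kraken2-build", "--add-to-library", fasta, "--db", db_dir])
    # Step 3: build the database from the taxonomy and library.
    cmds.append(["kraken2-build", "--build", "--threads", str(threads), "--db", db_dir])
    return cmds


# Placeholder names; print the commands so they can be reviewed before running.
for cmd in kraken2_custom_build_commands("my_db", ["bioproject.kraken.fa"]):
    print(shlex.join(cmd))
```

Each added FASTA record needs either an NCBI accession that the taxonomy maps can resolve or an explicit `kraken:taxid|XXX` tag in its header.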

ADD REPLY • link 3.8 years ago by Arup Ghosh 3.2k

Thanks, Arup. My problem is not with the Kraken2 build itself, but with downloading a BioProject from NCBI (and then converting it into the format that the Kraken2 build expects). Sorry that I did not describe my problem clearly.
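
One possible route for the download step is NCBI's E-utilities. The Python sketch below only builds the request URLs: an `esearch` against `nuccore` using the `[BioProject]` field with the history server enabled, then an `efetch` that pages through the stored result set. It assumes the nucleotide records are actually linked to the BioProject, which depends on the submission. The accession `PRJNA000000` and the function names are placeholders for illustration. Requesting `rettype=fasta` with `retmode=xml` returns TinySeq XML, which carries a taxid for each record.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(bioproject):
    """URL that searches nuccore for records belonging to one BioProject."""
    params = {
        "db": "nuccore",
        "term": f"{bioproject}[BioProject]",
        "usehistory": "y",  # keep the hits on the history server for efetch
    }
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_url(webenv, query_key, retstart=0, retmax=500):
    """URL that fetches one page of the stored hits as TinySeq XML."""
    params = {
        "db": "nuccore",
        "WebEnv": webenv,        # taken from the esearch response
        "query_key": query_key,  # taken from the esearch response
        "rettype": "fasta",
        "retmode": "xml",
        "retstart": retstart,
        "retmax": retmax,
    }
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


# Placeholder accession; substitute a real BioProject ID.
print(esearch_url("PRJNA000000"))
```

The `WebEnv` and `query_key` values come from the `esearch` response. For larger result sets, increase `retstart` in steps of `retmax` until every record has been fetched.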

ADD REPLY • link 3.8 years ago by jimmy0958073736 ▴ 40

Just a heads-up - you probably don't want to use the results obtained by Kraken2 with nt. There's real chaos in taxonomy assignments and the results are rarely trustworthy. It's much better to use databases derived from RefSeq; there's been some variation in how these are prepared, but this is a good and up-to-date link:

https://benlangmead.github.io/aws-indexes/k2

ADD REPLY • link 3.4 years ago by predeus ★ 2.1k
