Kraken2 RefSeq database build is missing some organisms

2

3.5 years ago

jimmy0958073736 ▴ 40

Hi everyone, I have a small question about building a Kraken2 database from RefSeq. I downloaded all of the RefSeq libraries I needed with commands like kraken2-build --download-library bacteria --db $DBNAME (and likewise for the fungal and viral libraries). Unfortunately, I can't find some of the organisms I'm interested in (e.g. the agent of Pneumocystis pneumonia, PJP). Has anyone run into the same problem?
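For reference, the download step described in the question could be scripted roughly as below. This is only a sketch: it assumes kraken2-build is on the PATH, the database name my_kraken_db is a placeholder, and the helper names download_commands and run_all are made up for illustration. By default it only prints the commands instead of running them.

```python
import shlex
import subprocess

# Library names passed to kraken2-build --download-library.
LIBRARIES = ["bacteria", "fungi", "viral"]


def download_commands(db_name, libraries=LIBRARIES):
    """Return one kraken2-build argv list per reference library."""
    return [
        ["kraken2-build", "--download-library", lib, "--db", db_name]
        for lib in libraries
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "my_kraken_db" is a hypothetical database directory name.
    run_all(download_commands("my_kraken_db"))
```

Listing each library explicitly makes it easy to see which RefSeq groups were actually requested, which is the first thing to check when an expected organism is missing.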

KRAKEN2 ref-seq • 1.9k views

ADD COMMENT • link 3.4 years ago by jimmy0958073736 ▴ 40

0

I have had bad experiences with some automatic classification tools such as Kraken, Kaiju, etc., even with mock data from known organisms. My humble suggestion: don't use automatic classification tools.

ADD REPLY • link 3.5 years ago by Arsenal ▴ 160

0

Unfortunately, I have already prepared the Kraken2 RefSeq database myself (not automatically), but the database seems too large to build (over 1 TB); my RAM cannot handle it.

ADD REPLY • link 3.4 years ago by jimmy0958073736 ▴ 40

0

Holy moly. Talk about huge. Could you maybe split this database? And how exactly did you process it? You've piqued my curiosity.

ADD REPLY • link 3.4 years ago by Arsenal ▴ 160

0

I tried to split the database FASTA file (.fa) into several parts, but the last step of the process requires merging them and building the Bracken database, so I don't see a way around this problem. In my case, I use the following header format for the database sequences (FASTA): >sequence16|kraken:taxid|32630. As I said, it is too huge (over 100 thousand microbial genomes need to be built).
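The header format mentioned above can be produced with a few lines of Python. This is a minimal sketch, not the poster's actual script: the function names tag_header and tag_fasta are hypothetical, it assumes one taxid per input file, and it assumes the sequence ID is separated from the description by a space.

```python
def tag_header(header, taxid):
    """Append '|kraken:taxid|<taxid>' to the sequence ID of a FASTA header."""
    body = header[1:] if header.startswith(">") else header
    # Split the ID from the free-text description at the first space.
    seq_id, _, description = body.partition(" ")
    tagged = f">{seq_id}|kraken:taxid|{taxid}"
    return f"{tagged} {description}" if description else tagged


def tag_fasta(lines, taxid):
    """Yield FASTA lines with every header tagged; sequence lines pass through."""
    for line in lines:
        line = line.rstrip("\n")
        yield tag_header(line, taxid) if line.startswith(">") else line


if __name__ == "__main__":
    # Toy record; the sequence is an arbitrary placeholder.
    example = [">sequence16 Adapter sequence", "CAAGCAGAAGACGGCATACGAGAT"]
    for out in tag_fasta(example, 32630):
        print(out)
```

A file tagged this way could then be added with kraken2-build --add-to-library, one file at a time, which avoids having to hold the whole collection in a single FASTA before the build step.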

ADD REPLY • link 3.4 years ago by jimmy0958073736 ▴ 40
