Question

Kraken_results or Kraken_report? Which file should I consider?

5.2 years ago

data_diva • 0

Hi all,

I ran Kraken on my samples and generated results for them. However, I am confused about which file, kraken.result or kraken.report, should be considered the final output.

Here is a sample report of my data (db used was minikrakendb):

78.58  632256  632256  U   0        unclassified
19.45  156475  3197    D   2          Bacteria
 6.38  51349   586     D1  1783272      Terrabacteria group
 5.73  46078   804     P   1239           Firmicutes

It seems that 78% of my reads are unclassified.

Also, going by the report, does row #3 mean that, of all the reads in my raw input fastq, only 51349 belonged to the Terrabacteria group? Or does Kraken perform a filtering step before classification, so that if I had 100000 reads in my raw input fastq file and Kraken filtered them down to (for instance) 80000, only 51349 of those mapped to Terrabacteria?

One final question: I have a list of TaxIDs and species names from the Kraken report. Are there any R packages that can retrieve pathways for those TaxIDs or species names?

Thank you!

kraken taxonomy pathways • 2.0k views

updated 5.1 years ago by onestop_data ▴ 330 • written 5.2 years ago by data_diva • 0

score 0 · Answer 1 · 2020-01-23

Please read the Kraken2 Manual. There is a section explaining in detail both outputs - look for STANDARD KRAKEN OUTPUT FORMAT and SAMPLE REPORT OUTPUT FORMAT.
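As a rough illustration of how the report columns relate to each other, here is a minimal Python sketch that parses the four report lines quoted in the question. It assumes the usual six report columns (percentage of reads in the clade, reads in the clade, reads assigned directly to the taxon, rank code, NCBI taxonomy ID, name); the names `ReportRow` and `parse_report_line` are made up for this example and are not part of Kraken. Because the root line is not shown in the excerpt, the total read count is only estimated from the unclassified row, so treat it as approximate.

```python
from typing import NamedTuple


class ReportRow(NamedTuple):
    percent: float      # % of all input reads falling in this clade
    clade_reads: int    # reads assigned to this taxon or anything below it
    direct_reads: int   # reads assigned to this taxon itself
    rank: str           # rank code, e.g. U, D, P
    taxid: int          # NCBI taxonomy ID
    name: str           # scientific name


def parse_report_line(line: str) -> ReportRow:
    # Split on any whitespace so both tab-separated files and pasted text work;
    # maxsplit=5 keeps names with spaces ("Terrabacteria group") in one field.
    f = line.strip().split(None, 5)
    return ReportRow(float(f[0]), int(f[1]), int(f[2]), f[3], int(f[4]), f[5])


SAMPLE = """\
78.58   632256  632256  U   0   unclassified
19.45   156475  3197    D   2       Bacteria
6.38  51349 586 D1  1783272    Terrabacteria group
5.73    46078   804 P   1239            Firmicutes
"""

rows = [parse_report_line(l) for l in SAMPLE.splitlines() if l.strip()]

# Assumption: estimate the total number of input reads from the unclassified row.
unclassified = rows[0]
total_reads = round(unclassified.clade_reads / (unclassified.percent / 100))

for row in rows:
    recomputed = 100 * row.clade_reads / total_reads
    print(f"{row.name}: {row.clade_reads} reads, "
          f"reported {row.percent}%, recomputed {recomputed:.2f}%")
```

If the recomputed percentages agree with the reported ones, that is consistent with every percentage being taken over all input reads (classified plus unclassified), with no separate filtering step in between.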

As for the other question, I would not advise retrieving pathways for the tax_ids and assuming those pathways are present in your community. Instead, I would recommend using a tool that annotates functions/pathways/subsystems directly from your sequences. One option is SUPER-FOCUS, which uses the SEED database to annotate the input sequences; another is MEGAN, which does taxonomic analysis but can also generate the functional analysis for you.

What kind of sample do you have? A virome? 78.58% unclassified :0.