Hello,
I obtained incongruent "classification" results when running Kraken2 vs. Minimap2 and Blast on the same data sets.
I am planning to assemble 5 bacterial genomes from PacBio reads. For 2 of the bacteria, I generated two raw assemblies using Canu. I suspected that there may be some contamination in the reads, so for each bacterium I evaluated the raw PacBio reads using Minimap2. Additionally, for the 2 draft assemblies, I used NCBI Blast to compare them to whole bacterial genomes as well as to the bacterial 16S rRNA databases. I was advised to use Kraken for a more comprehensive evaluation of possible contaminants.
Comparing the results from Minimap2 and Blast vs. Kraken2, I noticed some incongruence between the matches they report.
For example, Minimap2 indicates that 18.72% of the raw sequences of my Sphingomonas sp. map to Ignatzschineria larvae and 55% map to Sphingomonas paucimobilis. Kraken2, however, does not match Sphingomonas sp. to Ignatzschineria larvae at all. It does match Sphingomonas sp. to a Sphingomonas (Sphingomonas paucimobilis is not included in the library of reference sequences that Kraken2 builds).
Minimap2 indicates that 78.49% of the raw sequences of my Kurthia sp. map to Moellerella wisconsensis. However, Kraken2 does not match my Kurthia sp. to Moellerella wisconsensis at all.
For my Ignatzschineria sp., Minimap2 indicates that 81% of the raw sequences map to Moellerella wisconsensis and about 20% map to Providencia. Blast of the draft assembly of my Ignatzschineria sp. against whole genomes and against 16S indicates that my Ignatzschineria may, in fact, be Moellerella wisconsensis. However, Kraken2 does not match my Ignatzschineria sp. to Moellerella wisconsensis at all.
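For anyone who wants to reproduce this kind of per-reference percentage, here is a minimal sketch of one way to tally it from Minimap2 PAF output. It is not necessarily how the figures above were produced. It assumes each read is assigned to the single target with the most matching residues (PAF column 10), and the function name percent_reads_per_target and the total_reads argument are hypothetical names chosen for illustration.

```python
from collections import Counter


def percent_reads_per_target(paf_lines, total_reads):
    """Return {target_name: percent of total_reads whose best hit is that target}.

    Assumption: a read's "best" hit is the alignment with the most
    matching residues (PAF column 10). Unmapped reads do not appear in
    PAF output, so total_reads must be supplied by the caller.
    """
    best = {}  # read name -> (matching residues, target name)
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip empty or malformed lines
        read, target, matches = fields[0], fields[5], int(fields[9])
        if read not in best or matches > best[read][0]:
            best[read] = (matches, target)
    counts = Counter(target for _, target in best.values())
    return {t: 100.0 * n / total_reads for t, n in counts.items()}


# Toy example with made-up alignments (not real data):
example = [
    "r1\t1000\t0\t900\t+\tMoellerella\t5000\t0\t900\t850\t900\t60",
    "r1\t1000\t0\t300\t+\tProvidencia\t5000\t0\t300\t200\t300\t10",
    "r2\t1000\t0\t800\t-\tProvidencia\t5000\t0\t800\t700\t800\t60",
]
print(percent_reads_per_target(example, total_reads=4))
```

Note that percentages computed this way depend on how multi-mapping reads are resolved and on which reference sequences were in the index, so they are not directly comparable to the per-taxon percentages in a Kraken2 report.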
I ran Kraken2 on an IBM cluster.
Can someone give me a hint as to why I observe this incongruence between the results reported by these methods?
Best regards,
h.mom and colindaven,
Thank you for your answer. See below.