I ran Kraken2 with the viral database on a transcript assembly produced by rnaSPAdes.
The result looks OK: I found some of the viruses I was targeting.
Now I want the sequences that Kraken2 assigned to those taxonomic groups.
My Kraken2 output looks like this:
C NODE_24_length_27434_cov_34671.005410_g0_i23 1147722 27434 0:14927 196894:5 1147722:1 0:12164 28883:2 2545435:2 0:10 754059:4 0:285
I understand the columns as follows:
- 1: classified (C) or unclassified (U)
- 2: the FASTA header
- 3: the assigned taxonomy ID
- 4: the length of that sequence
- 5: the k-mer matches
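To make the column layout concrete, here is a minimal Python sketch that splits one output line into the five fields listed above and sums the k-mer counts per taxonomy ID. The function name `parse_kraken_line` is made up for illustration, and the sketch assumes single (unpaired) sequences such as assembled contigs, where the last column is a plain list of `taxid:count` pairs.

```python
from collections import Counter


def parse_kraken_line(line):
    """Split one Kraken2 per-sequence output line into its five columns."""
    # split(None, 4) splits on any whitespace (tabs in real Kraken2 output)
    # and leaves the whole k-mer list together as the fifth field.
    status, seq_id, taxid, length, kmer_field = line.split(None, 4)

    # Sum the k-mer counts per taxonomy ID; an ID can occur in several runs.
    kmers = Counter()
    for pair in kmer_field.split():
        tax, count = pair.rsplit(":", 1)
        kmers[tax] += int(count)

    return {
        "classified": status == "C",
        "seq_id": seq_id,
        "taxid": taxid,
        "length": int(length),
        "kmers": kmers,
    }


example = (
    "C NODE_24_length_27434_cov_34671.005410_g0_i23 1147722 27434 "
    "0:14927 196894:5 1147722:1 0:12164 28883:2 2545435:2 0:10 754059:4 0:285"
)
rec = parse_kraken_line(example)
total = sum(rec["kmers"].values())
# Total k-mers, k-mers with taxid 0, k-mers with the assigned taxid
print(total, rec["kmers"]["0"], rec["kmers"][rec["taxid"]])
```

For the example line this prints `27400 27386 1`: of 27400 k-mers, 27386 carry taxid 0 (which Kraken2 uses for k-mers with no database hit) and a single k-mer carries the assigned taxid 1147722.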
I then used the taxon IDs to find the matching FASTA headers and extracted those sequences from the assembly. However, when I run BLASTn on these sequences, it does not return any similar hits.
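A minimal sketch of this kind of extraction in plain Python, not necessarily the exact procedure used here. The function names and the file names in the usage comment are hypothetical. It assumes the sequence ID in column 2 equals the first whitespace-delimited token of the FASTA header, and it matches only the exact taxonomy IDs given, not their descendants in the taxonomy.

```python
def ids_for_taxids(kraken_lines, taxids):
    """Collect sequence IDs of classified lines whose taxid is in `taxids`."""
    wanted = set()
    for line in kraken_lines:
        fields = line.split(None, 4)
        # Column 1 is C/U, column 2 the sequence ID, column 3 the taxid.
        if len(fields) >= 3 and fields[0] == "C" and fields[2] in taxids:
            wanted.add(fields[1])
    return wanted


def extract_fasta(fasta_lines, wanted_ids):
    """Yield the header and sequence lines of every record in `wanted_ids`."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # The record ID is the first token after '>'.
            parts = line[1:].split()
            keep = bool(parts) and parts[0] in wanted_ids
        if keep:
            yield line


def extract_to_file(kraken_path, fasta_path, taxids, out_path):
    """Write all records assigned to `taxids` from the assembly to `out_path`."""
    with open(kraken_path) as kraken:
        wanted = ids_for_taxids(kraken, set(taxids))
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        out.writelines(extract_fasta(fasta, wanted))


# Hypothetical usage (file names are placeholders):
# extract_to_file("kraken2.out", "transcripts.fasta", {"1147722"}, "hits.fasta")
```

Taxonomy IDs are compared as strings, so they should be passed as `"1147722"` rather than as integers.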
Further, I understand that Kraken2 does not use the full sequence to identify the reported organism; it uses k-mers. So how do I get the sequences it identified?