Hi everyone,
I am building a custom Kraken2 database to analyse paired-end reads from stool samples. My project only looks at the abundances of bacterial species in the Lactobacillaceae family, so I downloaded all of the corresponding .fna files from NCBI.
When inspecting the .fna files, I found that many of them contain several sequences, often one complete genome of a bacterial strain plus one or two plasmids from the same strain. For my analysis I am only interested in the complete genomes. However, if I filter out every file that mentions "plasmid", I seem to lose a lot of .fna files that also contain valuable bacterial genomes.
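To make the filtering problem concrete, here is a minimal sketch of dropping plasmid records one sequence at a time instead of discarding whole files. It assumes that plasmid records can be recognised by the word "plasmid" in their FASTA header line, which I have not verified for every file. The accessions and sequences in the example are made up, and the function names are just illustrative.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (header, sequence_lines) for each record in a multi-FASTA file."""
    header = None
    seq: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, seq
            header = line
            seq = []
        elif header is not None and line:
            seq.append(line)
    if header is not None:
        yield header, seq


def drop_plasmids(lines: Iterable[str]) -> List[str]:
    """Keep only records whose header does not mention 'plasmid'.

    Assumption: plasmid records are labelled as such in the header line.
    """
    kept: List[str] = []
    for header, seq in read_fasta(lines):
        if "plasmid" in header.lower():
            continue
        kept.append(header)
        kept.extend(seq)
    return kept


# Hypothetical two-record file: one chromosome and one plasmid.
example = [
    ">NZ_X000001.1 Lactobacillus sp. strain A chromosome, complete genome",
    "ACGTACGT",
    "GGCC",
    ">NZ_X000002.1 Lactobacillus sp. strain A plasmid pA1, complete sequence",
    "TTTT",
]
filtered = drop_plasmids(example)
print("\n".join(filtered))
```

For a real file, the same function could be called on an open file handle, e.g. `drop_plasmids(open("genome.fna"))`, and the result written to a new .fna file, so the chromosome is kept even when the file also contains plasmids.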
Does anybody have experience with how Kraken2 handles these plasmid genomes? Are they processed individually, or is the end result one abundance score for the respective bacterial strain?
Thank you so much for your help!
BW, Rapha
Amazing, that's good news, thank you so much for your help!