I have posted this on the Centrifuge GitHub, but it doesn't seem to be very responsive.
I want to analyze microbial abundances at the genus level using centrifuge. However, I see that the genus-level abundances (and read counts) are lower than the abundances of the species within the corresponding genus, which seems a bit counterintuitive.
For example, if I do grep Homo report.tsv, I get
Homo 9605 genus 0 12 0 0.0
Homo sapiens 9606 species 3238442024 738484 395792 1.26543e-06
Shouldn't the software summarize abundances in a hierarchical manner? Please let me know if I am missing something.
Cheers,
Thanks for clarifying! This is actually exactly how I was interpreting the assignments until I saw the Kraken report format, where everything is summarized in a hierarchical manner, and that's why I thought it was most likely the case here as well.
"you would sum the counts at the genus itself (node 9605) with all of the counts on nodes that are descendants of Homo" - is there a way to do this in centrifuge (or another tool), or do I have to do it manually? Thanks
According to the Centrifuge documentation, you can create a Kraken-style report from the Centrifuge classification output using the centrifuge-kreport script.
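If you do end up doing the roll-up by hand, here is a minimal Python sketch of the idea. It assumes you already have the per-node read counts (the numReads column of the report) and a child-to-parent taxonomy map, which in practice would probably come from the NCBI taxonomy nodes.dmp file. The names cumulative_reads, direct_reads and parent_of are hypothetical and only for illustration; this is not part of centrifuge.

```python
from collections import defaultdict


def cumulative_reads(direct_reads, parent_of):
    """Add each node's own reads to itself and to every ancestor above it."""
    totals = defaultdict(int)
    for taxid, reads in direct_reads.items():
        node = taxid
        seen = set()  # guards against self-parent loops (e.g. a root pointing to itself)
        while node is not None and node not in seen:
            totals[node] += reads
            seen.add(node)
            node = parent_of.get(node)
    return dict(totals)


# Toy input: numReads values from the two grep lines in the question.
direct_reads = {9605: 12, 9606: 738484}
# Assumed child -> parent map (Homo sapiens 9606 sits under genus Homo 9605).
parent_of = {9606: 9605}

totals = cumulative_reads(direct_reads, parent_of)
print(totals[9605])  # 738496 = 12 reads at the genus + 738484 at the species
```

Note that this only sums read counts. Whether the abundance column can be added up the same way depends on how centrifuge estimates it, so I would check the Kraken-style report for that.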