Hi all!
I was testing the assign_taxonomy.py and summarize_taxa.py scripts with my own reference built from NCBI.
My reference was created the same way as the GreenGenes reference (a FASTA file and a taxonomy file), except that mine contains only species-level taxonomy (full-length 16S gene). Since I made it compatible with QIIME, I have observed that after running summarize_taxa.py all reads are somehow divided across all 7 levels.
Can you explain how summarize_taxa.py works? Are there thresholds (e.g. percentages) for the phylum, class, order, family, genus, and species levels that determine how many reads are assigned at each level?
Any help will be much appreciated :)
Best, Agata
This may help: