We are doing a fungal metagenomics analysis with MEGAN5, and many of the tree's branches end in very generic nodes. To test the software, we ran the following Candida albicans sequence, extracted from GenBank:
> Candida_albicans gene=18S_rRNA_part+ITS1+5,8S_rRNA+ITS2+28S_rRNA_part length=536
tccgtaggtg aacctgcgga aggatcatta ctgatttgct taattgcacc acatgtgttt
ttctttgaaa caaacttgct ttggcggtgg gcccagcctg ccgccagagg tctaaactta
caaccaattt tttatcaact tgtcacacca gattattact aatagtcaaa actttcaaca
acggatctct tggttctcgc atcgatgaag aacgcagcga aatgcgatac gtaatatgaa
ttgcagatat tcgtgaatca tcgaatcttt gaacgcacat tgcgccctct ggtattccgg
agggcatgcc tgtttgagcg tcgtttctcc ctcaaaccgc tgggtttggt gttgagcaat
acgacttggg tttgcttgaa agacggtagt ggtaaggcgg gatcgctttg acaatggctt
aggtctaacc aaaaacattg cttgcggcgg taacgtccac cacgtatatc ttcaaacttt
gacctcaaat caggtaggac tacccgctga acttaagcat atcaataagc ggagga
The following image shows the results of the analysis and its parameters:
Parameters were optimized to reach species level as much as possible. The BLAST results classify the sequence as Candida albicans in most of the hits, except for a few that are classified as Candida sp.
However, the tree stops at a higher taxon (Saccharomycetes). Is that what I should expect?
Is there any other parameter to maximize the classification of the reads? Should I expect to reach species level with another program using this data?
I just tried to reproduce this and got the same result, but one of my BLAST hits further down is actually to "Saccharomycetes sp.". Are you sure you don't have that hit in your results?
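To show why a single hit like that matters, here is a minimal sketch of a naive lowest-common-ancestor (LCA) placement in Python. It is not MEGAN's actual code: the function name `naive_lca` is made up, the lineages are simplified and hand-written for illustration (not pulled from the NCBI taxonomy), and MEGAN filters hits with settings such as min score and top percent before it places a read. The point is only that the read lands on the deepest node shared by all hits that are kept.

```python
from typing import List, Sequence


def naive_lca(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest root-to-leaf prefix shared by all lineages."""
    if not lineages:
        return []
    shared = []
    # zip stops at the shortest lineage, so we never index past its end.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Simplified, illustrative lineages (assumptions, not real database output).
albicans = ["Fungi", "Ascomycota", "Saccharomycetes",
            "Saccharomycetales", "Candida", "Candida albicans"]
candida_sp = ["Fungi", "Ascomycota", "Saccharomycetes",
              "Saccharomycetales", "Candida", "Candida sp."]
saccharomycetes_sp = ["Fungi", "Ascomycota", "Saccharomycetes",
                      "Saccharomycetes sp."]

# Only Candida hits: the read is placed at genus level.
genus_only = naive_lca([albicans, albicans, candida_sp])
print(genus_only[-1])

# One extra hit to "Saccharomycetes sp." pulls the placement up to class level.
with_class_hit = naive_lca([albicans, albicans, candida_sp, saccharomycetes_sp])
print(with_class_hit[-1])
```

In this toy case even the "Candida sp." hits alone keep the read from reaching species level, and a single "Saccharomycetes sp." hit moves it up to Saccharomycetes. If your result contains such a hit, it may be worth checking whether it still passes your score and top-percent thresholds.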