I have assembled metagenomic contigs, each with multiple annotations. The species name assigned to each annotation varies within a contig. That looks suspicious, I know, but the species generally belong to the same family. Assuming this reflects the known difficulty of assigning sequences at the species level:
Is it possible to get the full taxonomy (genus, family, order, class, phylum) for a list of species names? That would allow me to cluster annotations at a higher taxonomic rank.
Example: Bacteroides thetaiotaomicron belongs to (Phylum) Bacteroidetes; (Class) Bacteroidetes; (Order) Bacteroidales; (Family) Bacteroidaceae
Any additional comments or questions are welcome!
If you want to focus on the main taxonomic ranks, i.e. superkingdom, kingdom, phylum, class, order, family, genus and species, you can do the following:
Which yields:
PS: All the documentation for xtract is here: https://dataguide.nlm.nih.gov/edirect/xtract.html
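If you would rather post-process the taxonomy XML in a script, here is a minimal Python sketch of that kind of lookup. It assumes you have saved the output of an `efetch -db taxonomy -format xml` call and that each record follows the `TaxaSet/Taxon/LineageEx` layout. The names `main_ranks`, `MAIN_RANKS` and `SAMPLE_XML` are made up for illustration, and the sample is a trimmed, hand-written record rather than real NCBI output, so treat it as a starting point only.

```python
import xml.etree.ElementTree as ET

# Ranks to keep, in the order listed in the answer above.
MAIN_RANKS = ("superkingdom", "kingdom", "phylum", "class",
              "order", "family", "genus", "species")

# Hand-written, trimmed sample (an assumption, not real NCBI output).
# Some ranks are deliberately absent to show how gaps are handled.
SAMPLE_XML = """<TaxaSet>
  <Taxon>
    <ScientificName>Bacteroides thetaiotaomicron</ScientificName>
    <Rank>species</Rank>
    <LineageEx>
      <Taxon><ScientificName>cellular organisms</ScientificName><Rank>no rank</Rank></Taxon>
      <Taxon><ScientificName>Bacteria</ScientificName><Rank>superkingdom</Rank></Taxon>
      <Taxon><ScientificName>Bacteroidetes</ScientificName><Rank>phylum</Rank></Taxon>
      <Taxon><ScientificName>Bacteroidales</ScientificName><Rank>order</Rank></Taxon>
      <Taxon><ScientificName>Bacteroidaceae</ScientificName><Rank>family</Rank></Taxon>
      <Taxon><ScientificName>Bacteroides</ScientificName><Rank>genus</Rank></Taxon>
    </LineageEx>
  </Taxon>
</TaxaSet>"""


def main_ranks(xml_text, missing="NA"):
    """Return one dict per top-level Taxon, keyed by the main ranks."""
    root = ET.fromstring(xml_text)
    records = []
    for taxon in root.findall("Taxon"):
        ranks = {rank: missing for rank in MAIN_RANKS}
        # Walk the lineage and keep only nodes whose rank is a main rank.
        for node in taxon.findall("LineageEx/Taxon"):
            rank = node.findtext("Rank")
            if rank in ranks:
                ranks[rank] = node.findtext("ScientificName")
        # The queried taxon itself is not part of its own lineage.
        own_rank = taxon.findtext("Rank")
        if own_rank in ranks:
            ranks[own_rank] = taxon.findtext("ScientificName")
        records.append(ranks)
    return records


if __name__ == "__main__":
    for record in main_ranks(SAMPLE_XML):
        print("\t".join(record[rank] for rank in MAIN_RANKS))
```

With one such record per species name, clustering the annotations of a contig at a higher rank comes down to comparing their `family` or `order` entries. Ranks missing from a lineage come back as `NA` instead of shifting the columns.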