I want to do an enrichment analysis across all bacterial phyla/classes. For this, I first need to know how many of the genomes in my database are gammaproteobacteria, spirochetes, etc. Ideally I would like a list that gives the name of each bacterium, its type (phylum/class) and its accession number. Is there a repository which can give me this information?
Thanks Lina! I wasn't aware of the KEGG tree. BeautifulSoup solves the problem!
I have never used the eUtils API, but the Biopython link seems useful. Will definitely try that out. Thanks a ton! :)
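
For anyone following along: once the list of names, types and accession numbers is in hand, the counting step for the enrichment analysis could look roughly like the sketch below. It assumes the list has been saved as tab-separated text with the hypothetical columns `name`, `accession` and `group`; the accession values are made-up placeholders, and the helper names `load_rows` and `count_by_group` are mine, not part of KEGG, eUtils or Biopython.

```python
import csv
import io
from collections import Counter

# Illustrative sample only: the accession values are placeholders, not real IDs.
# In practice this text would come from the list scraped or downloaded earlier.
SAMPLE_TSV = "\n".join([
    "name\taccession\tgroup",
    "Escherichia coli\tACC0001\tGammaproteobacteria",
    "Vibrio cholerae\tACC0002\tGammaproteobacteria",
    "Borrelia burgdorferi\tACC0003\tSpirochaetes",
])


def load_rows(text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def count_by_group(rows):
    """Count how many genomes fall into each taxonomic group."""
    return Counter(row["group"] for row in rows)


rows = load_rows(SAMPLE_TSV)
counts = count_by_group(rows)

# Print the groups from most to least frequent.
for group, n in counts.most_common():
    print(f"{group}\t{n}")
```

These per-group totals would serve as the background counts against which a subset of genomes is compared.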