I'm trying to figure out the best way to do this. I have the newest taxdump.tar.gz and prot.accession2taxid.gz files from NCBI.
Is there a way to use TaxonKit to get all of the species-level taxonomy IDs for bacteria and then use them to pull the corresponding proteins out of nr?
I am reasonably certain this was asked recently. Have you searched Biostars via Google?
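In case it helps in the meantime, here is a minimal sketch of the filtering step. It assumes you already have a plain-text file with one taxid per line (for example, something like the output of `taxonkit list --ids 2 --indent ""`, where 2 is the taxid for Bacteria), and that prot.accession2taxid.gz has the usual tab-separated layout with a header line and the columns accession, accession.version, taxid, gi. The function names are my own, not part of TaxonKit. Note that many nr proteins are assigned to taxids below species level (strains and so on), so you may want every taxid under Bacteria rather than species-level ones only.

```python
import gzip


def load_taxids(path):
    """Read taxids into a set, taking the first whitespace-separated field of each line."""
    taxids = set()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:  # skip blank lines
                taxids.add(fields[0])
    return taxids


def accessions_for_taxids(acc2taxid_gz, taxids):
    """Stream the gzipped accession2taxid table and yield matching accession.version values.

    Assumes columns: accession, accession.version, taxid, gi (tab-separated, one header line).
    """
    with gzip.open(acc2taxid_gz, "rt") as handle:
        next(handle, None)  # skip the header line
        for line in handle:
            parts = line.rstrip("\n").split("\t")
            if len(parts) >= 3 and parts[2] in taxids:
                yield parts[1]
```

You would call `load_taxids` on your taxid file, pass the resulting set to `accessions_for_taxids` together with the path to prot.accession2taxid.gz, and write the yielded accessions to a file, one per line. That file could then be handed to something like `blastdbcmd -db nr -entry_batch` to extract the sequences, assuming you have a local BLAST copy of nr. The file is streamed line by line, so only the taxid set is held in memory.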