This is mostly a tool I made for personal use, but I thought I would share it. MetaMarkers is a script that quickly (in under 5 minutes) assesses and visualizes the taxonomic assignment and tetranucleotide frequencies of large contigs (> 5 kbp) in a metagenome assembly. It translates the nucleotide sequences, searches the translations for 11 ribosomal proteins, and BLASTs the hits against known reference marker proteins. The outputs are the protein sequences of the identified markers, their closest BLAST hits, and an annotated PCA of tetranucleotide frequencies for contigs with markers.
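To make the tetranucleotide part concrete, here is a minimal sketch of how a frequency profile per contig could be computed. It is not the code MetaMarkers actually uses: the function names `tetra_freqs` and `profile_contigs` are hypothetical, and I am assuming that windows containing ambiguous bases are skipped and that reverse complements are not merged.

```python
from itertools import product

# All 256 possible tetranucleotides in a fixed order, so every contig
# gets a vector with the same layout.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(TETRAMERS)}


def tetra_freqs(seq):
    """Return a 256-element list of tetranucleotide frequencies for one sequence."""
    seq = seq.upper()
    counts = [0] * len(TETRAMERS)
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:  # assumption: skip windows with N or other ambiguity codes
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [c / total for c in counts]


def profile_contigs(contigs, min_len=5000):
    """Map contig name -> frequency vector, keeping only contigs longer than min_len bp.

    `contigs` is a dict of name -> nucleotide sequence (hypothetical input shape).
    """
    return {
        name: tetra_freqs(seq)
        for name, seq in contigs.items()
        if len(seq) > min_len
    }
```

Each contig ends up as a point in 256-dimensional space. Stacking those vectors into a matrix and running any standard PCA on it gives the kind of 2D plot described above, with the marker hits supplying the annotations.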
MetaMarkers is not meant to replace real taxonomic profilers (MetaPhlAn, PhyloSift, Kraken, etc.), which are much more comprehensive. Those tools can also be a lot more resource- and time-intensive, though, and they can be cumbersome when your immediate goal is to get a quick, reliable picture of the microbial breakdown of a large metagenome. How complex is the community? Who are the few major players, and what are the large contigs? These are the basic questions I want to be able to answer extremely quickly.
The only requirements are a few Python libraries, BLAST+, and HMMER. The input is a FASTA file of assembled contigs. Let me know if you find it useful.