I extracted genes of interest from the genomes of hundreds of species. Then, I joined the extracted protein sequences of each species into a single FASTA file and used omamer to classify them into HOGs (Hierarchical Orthologous Groups). Next, I would like to study the evolutionary dynamics of these gene families. This seems to be possible with pyHAM. However, I currently have neither the species tree nor the orthoXML file required for such an analysis. For the species tree, I should be able to download a time-calibrated phylogeny from the internet and prune it with the R package ape, so I believe it won't be a big issue.
The problem is the orthoXML file: I noticed that OMA standalone can be used to obtain both the orthoXML file and the species tree. However, it seems that this program uses a different, alignment-based procedure to assign HOGs, which is much slower and maybe less reliable than omamer (alignment-free) according to omamer's publication.
Therefore, is there an automatic way to generate an orthoXML file using the output of omamer? If not, what would be the best way to study the evolutionary dynamics of my gene families of interest (taking into account the large size of my dataset)?
Thanks in advance
Could you please describe your goal in more detail? It would be great if you could clarify the term "evolutionary dynamics" (maybe I'm not knowledgeable enough).
The intention would be to use the HOGs produced by OMAmer to examine the joint evolutionary events acting on the set of species under consideration (i.e. evolutionary dynamics), similar to what OMA standalone does (see "Dynamics of Genome Evolution" at https://omabrowser.org/standalone/#applications).
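To make the request more concrete, here is a rough sketch of the kind of conversion I have in mind. It is not an existing omamer or pyHAM feature. It assumes the omamer assignments have already been parsed into (species, protein ID, HOG ID) tuples; the function name build_flat_orthoxml, the example IDs, and the placeholder values for NCBITaxId, database name, and versions are all hypothetical. It only writes flat groups (one orthologGroup per HOG ID), so the nested, taxon-annotated hierarchy that pyHAM appears to expect would still be missing.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ORTHOXML_NS = "http://orthoXML.org/2011/"


def build_flat_orthoxml(assignments):
    """Build a flat orthoXML string from (species, protein_id, hog_id) tuples.

    Assumption: each tuple is one protein with its assigned HOG ID.
    The result has no nested groups and no taxonomic annotations.
    """
    root = ET.Element(
        "orthoXML",
        {
            "xmlns": ORTHOXML_NS,
            "version": "0.3",
            "origin": "omamer-sketch",  # placeholder value
            "originVersion": "0",       # placeholder value
        },
    )

    genes_by_species = defaultdict(list)
    genes_by_hog = defaultdict(list)
    # orthoXML refers to genes through integer IDs that are unique per file.
    for gene_id, (species, protein_id, hog_id) in enumerate(assignments, start=1):
        genes_by_species[species].append((gene_id, protein_id))
        genes_by_hog[hog_id].append(gene_id)

    # One <species> block per species, listing its genes.
    for species, genes in genes_by_species.items():
        sp = ET.SubElement(root, "species", {"name": species, "NCBITaxId": "0"})
        db = ET.SubElement(sp, "database", {"name": "local", "version": "0"})
        genes_el = ET.SubElement(db, "genes")
        for gene_id, protein_id in genes:
            ET.SubElement(genes_el, "gene", {"id": str(gene_id), "protId": protein_id})

    # One flat <orthologGroup> per HOG ID, referencing its member genes.
    groups = ET.SubElement(root, "groups")
    for hog_id, gene_ids in genes_by_hog.items():
        group = ET.SubElement(groups, "orthologGroup", {"id": hog_id})
        for gene_id in gene_ids:
            ET.SubElement(group, "geneRef", {"id": str(gene_id)})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up example data, only to show the expected input shape.
    example = [
        ("speciesA", "protA1", "HOG:0000001"),
        ("speciesB", "protB1", "HOG:0000001"),
        ("speciesB", "protB2", "HOG:0000002"),
    ]
    print(build_flat_orthoxml(example))
```

What I do not know is how to recover the hierarchy (which subgroup belongs to which taxonomic level, and where duplications occurred) from the omamer output alone, which is the part pyHAM would need.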