Hi,
I have a group of proteins and I want to analyse whether they are related. Do you know if OMA can run with only a protein fasta file as input? I am getting an error saying that I have provided 0 genomes, so I am guessing it needs genomes as input, but maybe there is an option to change the input.
many thanks!
Hi, the aim of OMA is to identify orthologs and paralogs among complete genomes. As a prerequisite, OMA also computes Smith-Waterman alignments to identify the homologs in the first place. So in principle you can use OMA to identify the homologous protein pairs.
The later steps need to know which species each protein belongs to in order to distinguish between orthologs and paralogs, so OMA requires one fasta file per species. You can simply split your fasta file with all proteins into several files, one per species. For the homology detection, it doesn't matter that your genomes are (potentially highly) incomplete, although it does matter for the orthology detection.
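If it helps, here is a minimal sketch of the splitting step in plain Python. It assumes UniProt-style headers that carry the species as "OS=Homo sapiens OX=9606"; if your headers look different you will have to adapt species_of(). The function names and the .fa file naming are my own choices for illustration, not anything OMA prescribes.

```python
import re
from collections import defaultdict
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) tuples from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def species_of(header):
    """Extract the species name from a header.

    Assumption: UniProt-style header such as
    'sp|P12345|ABC_HUMAN Some protein OS=Homo sapiens OX=9606 GN=ABC'.
    Headers without an OS= field are grouped under 'unknown'.
    """
    match = re.search(r"OS=(.+?)(?= [A-Z]{2}=|$)", header)
    return match.group(1) if match else "unknown"


def split_by_species(fasta_path, out_dir):
    """Write one FASTA file per species and return {species: output path}."""
    groups = defaultdict(list)
    for header, seq in read_fasta(fasta_path):
        groups[species_of(header)].append((header, seq))

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = {}
    for species, records in groups.items():
        # Turn 'Homo sapiens' into a safe file name like 'Homo_sapiens.fa'.
        file_name = re.sub(r"\W+", "_", species).strip("_") + ".fa"
        out_path = out_dir / file_name
        with open(out_path, "w") as out:
            for header, seq in records:
                out.write(f">{header}\n{seq}\n")
        written[species] = out_path
    return written
```

You would call it as split_by_species("all_proteins.fasta", "DB"), where both names are placeholders. As far as I remember, OMA standalone looks for the per-species files in a DB directory, but please check the documentation for the exact location and naming it expects.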
Having said this, if you're only interested in homology identification, OMA is a bit of overkill and extracting the homology information is not straightforward. You could instead directly use a tool like BLAST or a Smith-Waterman implementation. The one used in OMA standalone is also available as a Python package named pyopa (pip install pyopa).
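To give an idea of what such an all-against-all homology search does, here is a toy Smith-Waterman sketch in plain Python. It is deliberately simplified and is not the pyopa API: it uses a fixed match/mismatch score and a linear gap penalty, whereas real tools use substitution matrices and affine gap costs and are much faster. The scoring values and the threshold are arbitrary assumptions for illustration.

```python
from itertools import combinations


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of a and b (linear gap cost)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def candidate_homologs(sequences, threshold):
    """Score all pairs in {name: sequence} and keep those at or above threshold."""
    hits = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sequences.items(), 2):
        score = smith_waterman(seq_a, seq_b)
        if score >= threshold:
            hits.append((name_a, name_b, score))
    return hits
```

For real data you would want pyopa or BLAST rather than this, since a sensible score threshold depends on the scoring matrix and the sequence lengths.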