Dear all, I am studying a well-known bacterial pathway. I checked a couple of its enzymes in the OMA browser (https://omabrowser.org/oma/) and found more than a thousand orthologs, but I only need the orthologs from Gram-positive bacteria. I could filter them manually, but I expect that would take about a week. How can I do this computationally? OMA offers its output in many file formats, but unfortunately I have little experience with HTML files or Python. Thank you very much for any help! Natasha
A quick look at the data files suggests that this will not be straightforward. You may want to write to the OMA team and ask whether they can run a custom query against their database on the backend to generate the data you are looking for.
A query with "gram positive" brings up this. Perhaps you could use that to get the sequence.
This is going to involve parsing the information from the available datasets. I suggest parsing either the "OMA groups" file (txt or XML) or the "OMA Groups/Sequences in COGs format" file. However, I could not find the key piece: a file listing which species are Gram-positive. You may need to work that out from the NCBI taxonomy database.
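The taxonomy step could be sketched in Python roughly as follows. This is only a minimal sketch under some assumptions, not a tested OMA workflow: it assumes you can get an NCBI taxon ID for each ortholog (for example from the species information OMA publishes), that you have downloaded nodes.dmp from the NCBI taxonomy dump, and that "Gram-positive" can be approximated by the two phyla Firmicutes (taxon ID 1239) and Actinobacteria (taxon ID 201174). The function names and the (protein ID, taxon ID) record layout are made up for illustration.

```python
# NCBI taxon IDs used here as a stand-in for "Gram-positive" (an assumption;
# add or remove phyla to match the definition you need).
GRAM_POSITIVE_PHYLA = {1239, 201174}  # Firmicutes, Actinobacteria


def load_parents(lines):
    """Build {taxid: parent_taxid} from lines in NCBI nodes.dmp format.

    Each line looks like: tax_id <tab>|<tab> parent_tax_id <tab>|<tab> rank ...
    """
    parents = {}
    for line in lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed lines
        parents[int(fields[0])] = int(fields[1])
    return parents


def is_descendant(taxid, parents, targets):
    """Walk from taxid up to the root; True if taxid or any ancestor is in targets."""
    seen = set()
    while taxid in parents and taxid not in seen:
        if taxid in targets:
            return True
        seen.add(taxid)          # the root is its own parent, so guard against loops
        taxid = parents[taxid]
    return taxid in targets


def filter_orthologs(records, parents, targets=GRAM_POSITIVE_PHYLA):
    """Keep only (protein_id, taxid) pairs whose species falls under a target phylum."""
    return [(pid, tax) for pid, tax in records if is_descendant(tax, parents, targets)]
```

To use it, open nodes.dmp and pass the file handle to load_parents (for example, `with open("nodes.dmp") as fh: parents = load_parents(fh)`), then call filter_orthologs on your list of (protein ID, taxon ID) pairs. Walking up the tree from each species avoids having to list every Gram-positive strain by hand, but the result is only as good as the choice of phyla.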