Question

Known bacterial pathway: the easiest way to find and download 1:1 orthologs of each enzyme involved

0

6.8 years ago

natasha.sernova ★ 4.0k

Dear all, I am studying a well-known bacterial pathway. I checked a couple of its enzymes in the OMA browser (https://omabrowser.org/oma/) and found more than a thousand orthologs, but I need only the orthologs in Gram-positive bacteria. I could do it manually, but I expect that would take a week. How can I do it computationally? OMA provides its output in many file formats, but unfortunately I have little experience with HTML files and Python. Thank you very much for any help! Natasha

genome OMA • 1.5k views

updated 6.8 years ago by Adrian Altenhoff ★ 1.1k • written 6.8 years ago by natasha.sernova ★ 4.0k

1

A quick look at the data files suggests that this would not be straightforward. You may want to write to the OMA folks to see whether they can run a custom query against their database on the backend to generate the data you are looking for.

A query with "gram positive" brings up this. Perhaps you could use that to get the sequence.

6.8 years ago by GenoMax 147k

1

This is going to involve parsing the information from the available datasets. I suggest parsing either the "OMA groups" file (in txt or xml) or the "OMA Groups/Sequences in COGs format" file. However, I could not find the key file listing the Gram-positive bacteria. You may need to use the NCBI taxonomy database for this.

6.8 years ago by Joseph Hughes ★ 3.0k

score 2 · Answer 1 · 2018-01-23

As @Joseph Hughes points out, you will either have to use the REST API to search for the orthologs of your query genes and limit them to the Gram-positive genomes (as far as I understand, these are essentially the Actinobacteria, though most Firmicutes are Gram-positive as well), or parse the flat files that contain all the orthologs and keep only those from your clade of interest. Both approaches require some scripting in your favorite language.

To get the set of Actinobacteria in OMA from the REST API you can use the following get-url: https://omabrowser.org/api/taxonomy/Actinobacteria/
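For anyone not used to scripting, here is a minimal Python sketch of that first step. It assumes the endpoint returns the clade as a nested JSON tree in which inner nodes have a "children" list and species (leaf) nodes carry a five-letter "code" field. Those field names are my assumption about the response layout, so check them against what the URL actually returns. The helper names are made up for illustration.

```python
import json
from urllib.request import urlopen

OMA_API = "https://omabrowser.org/api"


def fetch_json(url):
    """Download a URL and decode its JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def collect_species_codes(node):
    """Recursively gather species codes from a nested taxonomy tree.

    ASSUMPTION: inner nodes hold their subclades under "children" and
    species nodes carry an OMA species code under "code".
    """
    codes = set()
    if "code" in node:
        codes.add(node["code"])
    for child in node.get("children", []):
        codes |= collect_species_codes(child)
    return codes


# Usage (needs network access, so it is left commented out):
# tree = fetch_json(f"{OMA_API}/taxonomy/Actinobacteria/")
# actino_codes = collect_species_codes(tree)
# print(len(actino_codes), "Actinobacteria genomes in OMA")
```

The same call with another clade name in the URL should give the species set for that clade, provided OMA knows the name.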

Then, you can limit the orthologs (either pairwise or HOGs) to species belonging to the set of Actinobacteria.
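The filtering step could look roughly like the sketch below. It assumes you already have a set of OMA species codes for your clade and a list of pairwise ortholog records as parsed JSON. The record layout (an "omaid", a "rel_type" such as "1:1", and a nested "species" object with a "code") and the /protein/<entry id>/orthologs/ endpoint in the comment are assumptions on my side, not something stated above, and the sample IDs are invented.

```python
def filter_orthologs(orthologs, allowed_codes, rel_type="1:1"):
    """Keep orthologs from the allowed species with the wanted relation type.

    ASSUMPTION: each record looks like
    {"omaid": ..., "rel_type": ..., "species": {"code": ...}}.
    """
    kept = []
    for record in orthologs:
        code = record.get("species", {}).get("code")
        if code in allowed_codes and record.get("rel_type") == rel_type:
            kept.append(record)
    return kept


# Hypothetical records, only to show the expected shape of the input.
sample = [
    {"omaid": "AAAAA00001", "rel_type": "1:1", "species": {"code": "AAAAA"}},
    {"omaid": "AAAAA00002", "rel_type": "1:n", "species": {"code": "AAAAA"}},
    {"omaid": "ZZZZZ00001", "rel_type": "1:1", "species": {"code": "ZZZZZ"}},
]

for hit in filter_orthologs(sample, {"AAAAA", "BBBBB"}):
    print(hit["omaid"])

# With real data the list would come from the REST API, for example
# (assumed endpoint): https://omabrowser.org/api/protein/<entry id>/orthologs/
```

Running this once per enzyme in the pathway and writing out the surviving IDs gives the list of 1:1 orthologs to download.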