Question

Best Approach to Identify Human-C. elegans Orthologs?

0

17 days ago

bioinformatics_rk • 0

I’m working with a list of approximately 3,000 human genes and want to identify their orthologs in C. elegans. While I understand that the distinction between homologs and orthologs isn't always straightforward, for my purposes, I'm focused on orthologs or closely related genes.

I aim to use this in a publication, so I want to rely on databases or resources backed by reputable organizations and widely cited in the field.

So far, I’ve explored EggNOG6 (http://eggnog6.embl.de/#/app/home). I downloaded the e6.og2seqs_and_species.tsv file (description here: https://github.com/eggnogdb/eggnog_docs/wiki/Description-of-download-files), which organizes species into orthologous groups (OGs). However, the relationships between human and C. elegans genes often appear as many-to-many, making it challenging to pinpoint precise matches.

Does anyone have recommendations for the best approach or alternative resources to achieve this? Any insights or advice would be greatly appreciated!

c.elegans Orthologs Homologs Genomics • 584 views

ADD COMMENT • link 16 days ago by bioinformatics_rk • 0

4

Regarding the issue of "one-to-many" relationships in the EggNOG results:

Humans and C. elegans share a common ancestor that lived roughly 700 MYA (at least according to timetree.org). Because genes have duplicated in both lineages since then, it is basically inevitable that some genes shared by common ancestry will have complex relationships rather than simple one-to-one ones.

So if you use one of the suggestions provided by others, don't be surprised if you continue to see a lot of one-to-many or many-to-many relationships in the results. This just stems from how evolution works, and is in some sense unavoidable.

Of course, you can focus your analysis on only the genes that have maintained a one-to-one relationship between the two species, but that could end up being a relatively small subset of your data.
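To make the one-to-one filtering concrete, here is a minimal sketch in Python. It assumes you have already flattened the output of whichever resource you use into (human gene, worm gene) pairs; the gene identifiers below are made-up placeholders, and `classify_pairs` is a hypothetical helper, not part of any database's tooling. It labels each pair by how many partners each gene has, which is a simplification compared with full orthologous-group analysis.

```python
from collections import defaultdict


def classify_pairs(pairs):
    """Label each (human, worm) pair by how many partners each gene has."""
    pairs = set(pairs)
    worm_by_human = defaultdict(set)
    human_by_worm = defaultdict(set)
    for human, worm in pairs:
        worm_by_human[human].add(worm)
        human_by_worm[worm].add(human)

    labels = {}
    for human, worm in pairs:
        n_worm = len(worm_by_human[human])   # worm partners of this human gene
        n_human = len(human_by_worm[worm])   # human partners of this worm gene
        if n_worm == 1 and n_human == 1:
            labels[(human, worm)] = "one-to-one"
        elif n_worm > 1 and n_human > 1:
            labels[(human, worm)] = "many-to-many"
        elif n_worm > 1:
            labels[(human, worm)] = "one-to-many"   # one human, several worm
        else:
            labels[(human, worm)] = "many-to-one"   # several human, one worm
    return labels


# Placeholder identifiers, purely illustrative (not real orthology calls).
pairs = [("H1", "w1"), ("H2", "w2"), ("H2", "w3"), ("H3", "w4"), ("H4", "w4")]
labels = classify_pairs(pairs)
one_to_one = sorted(p for p, label in labels.items() if label == "one-to-one")
print(one_to_one)
```

Counting the labels first (for example with `collections.Counter(labels.values())`) shows how much of the gene list would be lost by keeping only the one-to-one subset.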

ADD REPLY • link 16 days ago by Dave Carlson ★ 2.1k

2

I can only second this! As an alternative, you might want to compare your list against other tools. In the OMA Browser, we provide, for example, a genome-pair view: https://omabrowser.org/oma/genomePW/
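One way to compare the lists from several tools is to count how many of them report each pair. The sketch below assumes each tool's output has been exported and converted into (human gene, worm gene) pairs using the same identifier scheme; the resource names and gene identifiers are placeholders, and `support_counts` is a hypothetical helper rather than a feature of any of these databases.

```python
from collections import Counter


def support_counts(predictions):
    """Count how many resources report each (human, worm) pair."""
    counts = Counter()
    for pairs in predictions.values():
        counts.update(set(pairs))  # set() so each resource counts once per pair
    return counts


# Placeholder data: resource names and gene identifiers are illustrative only.
predictions = {
    "resource_a": [("H1", "w1"), ("H2", "w2")],
    "resource_b": [("H1", "w1"), ("H2", "w3")],
    "resource_c": [("H1", "w1"), ("H2", "w2"), ("H3", "w4")],
}

counts = support_counts(predictions)
# Keep pairs reported by at least two resources (the threshold is a judgment call).
consensus = sorted(pair for pair, n in counts.items() if n >= 2)
print(consensus)
```

Pairs supported by only one resource are not necessarily wrong, so it may be worth reporting the support count alongside each pair rather than discarding them.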

ADD REPLY • link 16 days ago by Adrian Altenhoff ★ 1.1k

0

Thank you for your help!!

ADD REPLY • link 16 days ago by bioinformatics_rk • 0

1

Have you checked OrthoList (http://www.greenwaldlab.org/ortholist/)?

ADD REPLY • link 17 days ago by GenoMax 148k

0

Thank you for your help!!

ADD REPLY • link 16 days ago by bioinformatics_rk • 0

1

You can construct your own database by doing a reciprocal best hit search. Use something like MMseqs2's rbh module with both sets of sequences as inputs.

You'll probably still end up with some one-to-many (e.g., due to in-paralogs) situations that you'll have to resolve.
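For illustration, the reciprocal-best-hit logic itself can be sketched in a few lines of Python. This is not how MMseqs2 implements it; it only assumes you have two hit tables (human vs. worm and worm vs. human) already parsed into (query, target, score) tuples. The identifiers and scores are invented, and `best_hits` and `reciprocal_best_hits` are hypothetical helper names.

```python
def best_hits(hits):
    """Map each query to its highest-scoring target (first seen wins ties)."""
    best = {}
    for query, target, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(forward_hits, reverse_hits):
    """Return (query, target) pairs that are each other's best hit."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return sorted((q, t) for q, t in forward.items() if reverse.get(t) == q)


# Invented example hits: (query, target, score).
human_vs_worm = [
    ("H1", "w1", 250.0),
    ("H1", "w2", 90.0),
    ("H2", "w2", 180.0),
    ("H3", "w2", 120.0),
]
worm_vs_human = [
    ("w1", "H1", 245.0),
    ("w2", "H2", 175.0),
    ("w2", "H3", 110.0),
]

rbh = reciprocal_best_hits(human_vs_worm, worm_vs_human)
print(rbh)
```

Here H3's best hit is w2, but w2's best hit is H2, so H3 ends up without a partner. That is the kind of case (possibly an in-paralog) you would have to resolve by hand or with a tree-based method.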

On the "distinction between homologs and orthologs": orthologs are (one type of) homologs.

ADD REPLY • link 17 days ago by Dunois ★ 2.8k

score 2 · Answer 1 · 2024-12-05

For well-annotated and model organisms, I recommend using the orthologues in BioMart. Try the following setup (URL shortened).

Then add the genes you want to fetch in the Filter field, or use the generated Perl code or the biomaRt interface in R.

I recommend this approach because you are unlikely to outperform the curated orthologue detection in extensively annotated genomes by a de novo approach using a locally executed pipeline.

score 2 · Answer 2 · 2024-12-05

2

16 days ago

harold.smith.tarheel ★ 5.0k

OrthoList 2 (reference) is the most comprehensive and up-to-date resource for worm/human orthology.