Best Approach to Identify Human-C. elegans Orthologs?
2
0
17 days ago

I’m working with a list of approximately 3,000 human genes and want to identify their orthologs in C. elegans. While I understand that the distinction between homologs and orthologs isn't always straightforward, for my purposes, I'm focused on orthologs or closely related genes.

I aim to use this in a publication, so I want to rely on databases or resources backed by reputable organizations and widely cited in the field.

So far, I’ve explored EggNOG6 (http://eggnog6.embl.de/#/app/home). I downloaded the e6.og2seqs_and_species.tsv file (description here: https://github.com/eggnogdb/eggnog_docs/wiki/Description-of-download-files), which organizes species into orthologous groups (OGs). However, the relationships between human and C. elegans genes often appear as many-to-many, making it challenging to pinpoint precise matches.

Does anyone have recommendations for the best approach or alternative resources to achieve this? Any insights or advice would be greatly appreciated!

c.elegans Orthologs Homologs Genomics • 585 views
4

Regarding the issue of "one-to-many" relationships in the EggNOG results:

Humans and C. elegans share a common ancestor that lived roughly 700 million years ago (at least according to timetree.org). Because genes have duplicated in both lineages since then, it is basically inevitable that some of the genes shared by common ancestry will have complex relationships rather than simple one-to-one correspondences.

So if you use one of the suggestions provided by others, don't be surprised if you continue to see a lot of one-to-many or many-to-many relationships in the results. This just stems from how evolution works, and is in some sense unavoidable.

Of course, you can focus your analysis on only the genes that have maintained a one-to-one relationship between the two species, but that could end up being a relatively small subset of your data.
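To see how much of a list survives such a filter, each human-worm pair can be labelled by how many partners each gene has. The following is only a minimal sketch: the function name `classify_pairs` and the gene IDs are hypothetical placeholders, and the input is assumed to be a plain list of (human gene, worm gene) pairs exported from whichever resource you end up using.

```python
from collections import defaultdict

def classify_pairs(pairs):
    """Label each (human, worm) pair by the number of partners on each side."""
    pairs = set(pairs)  # drop duplicate rows
    human_to_worm = defaultdict(set)
    worm_to_human = defaultdict(set)
    for human, worm in pairs:
        human_to_worm[human].add(worm)
        worm_to_human[worm].add(human)

    labels = {}
    for human, worm in pairs:
        many_worm = len(human_to_worm[human]) > 1   # human gene has several worm partners
        many_human = len(worm_to_human[worm]) > 1   # worm gene has several human partners
        if many_worm and many_human:
            labels[(human, worm)] = "many-to-many"
        elif many_worm:
            labels[(human, worm)] = "one-to-many"
        elif many_human:
            labels[(human, worm)] = "many-to-one"
        else:
            labels[(human, worm)] = "one-to-one"
    return labels

# Hypothetical placeholder IDs, not real genes.
example_pairs = [("H1", "W1"), ("H2", "W2"), ("H2", "W3"), ("H3", "W4"), ("H4", "W4")]
labels = classify_pairs(example_pairs)
one_to_one = sorted(pair for pair, kind in labels.items() if kind == "one-to-one")
print(one_to_one)
```

Counting the labels gives a quick sense of how small the strict one-to-one subset is before you commit to it.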

2

I can only second this! As an alternative, you might want to compare your list against other tools. In the OMA Browser, for example, we provide a genome-pair view: https://omabrowser.org/oma/genomePW/

0

Thank you for your help!!

1

You can construct your own database by doing a reciprocal best hit search. Use something like MMseqs2's rbh module with both sets of sequences as inputs.

You'll probably still end up with some one-to-many (e.g., due to in-paralogs) situations that you'll have to resolve.
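For anyone unfamiliar with the idea, the reciprocal-best-hit logic itself can be sketched in a few lines. This is an illustration of the concept, not MMseqs2's implementation: the function names, the (query, target, score) tuple layout, and the IDs are all hypothetical, and the scores stand in for whatever bit score your search tool reports.

```python
def best_hits(hits):
    """Return {query: target} keeping the highest-scoring target per query.

    Ties keep the first hit seen; a real pipeline needs an explicit tie rule.
    """
    best = {}
    for query, target, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _score) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

# Hypothetical search results: (query, target, score).
human_vs_worm = [
    ("humanA", "wormX", 200),
    ("humanA", "wormY", 150),
    ("humanB", "wormX", 180),
    ("humanC", "wormZ", 90),
]
worm_vs_human = [
    ("wormX", "humanA", 210),
    ("wormX", "humanB", 175),
    ("wormY", "humanA", 140),
    ("wormZ", "humanC", 95),
]

rbh = reciprocal_best_hits(human_vs_worm, worm_vs_human)
print(rbh)
```

In this toy data humanB's best hit is wormX, but wormX's best hit is humanA, so humanB is left without a partner. That is the kind of case (a likely paralog) you would then have to resolve by hand or with a tree-based method.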

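Regarding the "distinction between homologs and orthologs": orthologs are (one type of) homologs, namely those that diverged through a speciation event, as opposed to paralogs, which diverged through a duplication.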

2
17 days ago
Michael 55k

For well-annotated and model organisms, I recommend using the orthologues in BioMart. Try the following setup (URL shortened)

Then add the genes you want to fetch in the Filter field or use the generated Perl code or the biomaRt interface in R.

I recommend this approach because you are unlikely to outperform the curated orthologue detection in extensively annotated genomes by a de novo approach using a locally executed pipeline.
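If you prefer to script the query, BioMart also accepts an XML query document. The sketch below only builds such a document and does not contact any server. The dataset name `hsapiens_gene_ensembl` and the `celegans_homolog_*` attribute names are my assumptions about how the Ensembl mart names these fields, so check them against the attribute list shown in the BioMart web interface before relying on them. The function name is hypothetical and the two Ensembl gene IDs are just examples.

```python
import xml.etree.ElementTree as ET

# Assumed attribute names; verify them in the BioMart interface.
DEFAULT_ATTRIBUTES = (
    "ensembl_gene_id",
    "celegans_homolog_ensembl_gene",
    "celegans_homolog_associated_gene_name",
    "celegans_homolog_orthology_type",
)

def build_biomart_query(gene_ids, dataset="hsapiens_gene_ensembl",
                        attributes=DEFAULT_ATTRIBUTES):
    """Return a BioMart XML query string restricted to the given gene IDs."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    dataset_el = ET.SubElement(query, "Dataset",
                               {"name": dataset, "interface": "default"})
    # The filter takes a comma-separated list of IDs.
    ET.SubElement(dataset_el, "Filter",
                  {"name": "ensembl_gene_id", "value": ",".join(gene_ids)})
    for attribute in attributes:
        ET.SubElement(dataset_el, "Attribute", {"name": attribute})
    return ET.tostring(query, encoding="unicode")

# Example human Ensembl gene IDs (TP53 and BRCA1).
xml_text = build_biomart_query(["ENSG00000141510", "ENSG00000012048"])
print(xml_text)
```

The resulting string is what you would submit as the `query` parameter to the BioMart martservice endpoint. For 3,000 genes it is probably safer to send the IDs in batches, or to download the full human-worm table once and filter it locally.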

2
17 days ago

OrthoList 2 (reference) is the most comprehensive and up-to-date resource for worm/human orthology.
