Question

Finding orthologs in an organism with a genome duplication

1

5.7 years ago

bio.erikson ▴ 20

I'm trying to map human orthologs to the allotetraploid Xenopus laevis. I've tried reciprocal best hit (RBH) BLAST and looked at a number of other ortholog-finding tools, but they all seem designed to find one-to-one relationships. This is problematic for X. laevis, which has two copies of almost every gene. Can anyone recommend a workflow that can handle many-to-one relationships, or does anyone have any bright ideas?

I've also tried Xenbase's manually curated human orthology, but I need Ensembl IDs, and converting from Xenbase to Entrez to Ensembl is very messy.
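To make the one-to-one limitation concrete, here is a minimal Python sketch (not the implementation of any particular tool) that contrasts strict reciprocal best hits with a relaxed rule that keeps every frog gene whose best human hit is the same human gene. The gene names and bit scores are invented for illustration; in practice the hit tuples would be read from BLAST tabular output.

```python
from collections import defaultdict


def best_hits(hits):
    """Return {query: best-scoring subject} from (query, subject, bitscore) tuples."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def strict_rbh(frog_vs_human, human_vs_frog):
    """Classic RBH: keep a pair only if each gene is the other's single best hit."""
    forward = best_hits(frog_vs_human)
    reverse = best_hits(human_vs_frog)
    return {(frog, human) for frog, human in forward.items() if reverse.get(human) == frog}


def many_to_one(frog_vs_human):
    """Relaxed rule: group all frog genes that share the same best human hit."""
    grouped = defaultdict(list)
    for frog, human in best_hits(frog_vs_human).items():
        grouped[human].append(frog)
    return {human: sorted(frogs) for human, frogs in grouped.items()}


# Hypothetical hits for two homeologs (.L and .S) of a single frog gene.
frog_vs_human = [
    ("tp53.L", "TP53", 600.0),
    ("tp53.L", "TP63", 200.0),
    ("tp53.S", "TP53", 580.0),
]
human_vs_frog = [
    ("TP53", "tp53.L", 600.0),
    ("TP53", "tp53.S", 580.0),
]

rbh_pairs = strict_rbh(frog_vs_human, human_vs_frog)
grouped = many_to_one(frog_vs_human)
print(rbh_pairs)
print(grouped)
```

Strict RBH keeps only the higher-scoring homeolog, while the relaxed rule keeps both. The relaxed rule is only a rough heuristic: it can also pull in more distant paralogs, so it is no substitute for a tree-based or graph-based orthology method.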

orthologs genome duplication allotetraploid • 2.0k views

ADD COMMENT • link 5.7 years ago by bio.erikson ▴ 20

0

It depends on your workflow. If you are using FASTA sequences as a starting point, you only need to filter out near-identical sequences, which will hopefully get rid of all duplicated proteins. This can be done using CD-HIT:

cd-hit -i input.fas -o input.95 -c 0.95

When two or more sequences share >=95% identity, CD-HIT groups them into one cluster and keeps only the longest sequence of that cluster in the output. After this step you do the orthology finding as usual.

ADD REPLY • link 5.7 years ago by Mensur Dlakic ★ 29k

0

Two problems. First, the WGD is ancient, and many of the duplicated genes have low sequence identity (> 70%) yet still have functionally identical roles. Second, I need to know the human ortholog for both gene copies.

ADD REPLY • link 5.7 years ago by bio.erikson ▴ 20

0

CD-HIT can cluster at 70% identity, and even down to 40%.
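One practical detail when lowering the threshold: CD-HIT expects the word length (`-n`) to match the identity cutoff (`-c`). The sketch below builds the command line accordingly. The ranges follow the CD-HIT user guide as I remember it (5 for 0.7 to 1.0, 4 for 0.6 to 0.7, 3 for 0.5 to 0.6, 2 for 0.4 to 0.5), so please check them against your installed version; the helper name `cd_hit_command` is made up for this example.

```python
def cd_hit_command(fasta_in, out_prefix, identity):
    """Build a cd-hit argument list with a word length suited to the identity cutoff."""
    if not 0.4 <= identity <= 1.0:
        raise ValueError("protein cd-hit thresholds are expected between 0.4 and 1.0")
    if identity >= 0.7:
        word_length = 5
    elif identity >= 0.6:
        word_length = 4
    elif identity >= 0.5:
        word_length = 3
    else:
        word_length = 2
    return [
        "cd-hit",
        "-i", fasta_in,
        "-o", out_prefix,
        "-c", f"{identity:.2f}",
        "-n", str(word_length),
    ]


cmd = cd_hit_command("input.fas", "input.70", 0.70)
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` if cd-hit is on your PATH.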

When you find a human ortholog for one of the two protein copies, presumably you have found it for the other copy as well. It is a simple functional transfer. CD-HIT creates .clstr files, which tell you which proteins were grouped together.
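As a rough illustration of that transfer, the Python sketch below parses CD-HIT's `.clstr` output (each cluster starts with a `>Cluster N` line, and the representative is the member marked with `*`) and copies the human ortholog assigned to each representative onto every member of its cluster. The sequence IDs and the `rep_to_human` mapping are hypothetical; the mapping would come from whatever orthology search you run on the representatives.

```python
import re

# Matches member lines such as "1\t386aa, >tp53.S... at 72.50%" or "... *".
MEMBER = re.compile(r">(\S+?)\.\.\.\s+(\*|at\s+[\d.]+%)")


def parse_clstr(lines):
    """Return {representative_id: [all member ids, representative included]}."""
    clusters = {}
    members, representative = [], None
    for line in lines:
        line = line.strip()
        if line.startswith(">Cluster"):
            if representative is not None:
                clusters[representative] = members
            members, representative = [], None
            continue
        match = MEMBER.search(line)
        if match:
            members.append(match.group(1))
            if match.group(2) == "*":
                representative = match.group(1)
    if representative is not None:
        clusters[representative] = members
    return clusters


def transfer_orthologs(clusters, rep_to_human):
    """Give every cluster member the human ortholog found for its representative."""
    return {
        member: rep_to_human[rep]
        for rep, cluster_members in clusters.items()
        if rep in rep_to_human
        for member in cluster_members
    }


# Hypothetical .clstr content for illustration only.
example = (
    ">Cluster 0\n"
    "0\t393aa, >tp53.L... *\n"
    "1\t386aa, >tp53.S... at 72.50%\n"
    ">Cluster 1\n"
    "0\t120aa, >geneX.L... *\n"
).splitlines()

clusters = parse_clstr(example)
mapping = transfer_orthologs(clusters, {"tp53.L": "TP53"})
print(mapping)
```

Note that a cluster built at a low identity cutoff may also contain older paralogs, not just the two homeologs, so it is worth inspecting cluster sizes before transferring annotations.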

ADD REPLY • link 5.7 years ago by Mensur Dlakic ★ 29k

0

I am developing software for finding local alignments. Could you please share one (or more) of the sequences from Xenopus laevis? I'd like to check whether the results could be helpful for you. I have had success aligning sequences from some highly diverged species. Then maybe I could think of a workflow...

ADD REPLY • link 5.7 years ago by juanjo75es ▴ 130

0

You can use OrthoFinder, which will give you one-to-one, one-to-many, many-to-one, and many-to-many orthology relationships.
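Here is a small Python sketch of how such output could be labelled by relationship type. It assumes a tab-separated pairwise table with a header row and three columns (orthogroup, genes of species 1, genes of species 2), where each gene column holds a comma-separated list. That layout is my recollection of OrthoFinder's per-species-pair orthologue files, so treat it as an assumption and check it against your own results. The species names and gene IDs are invented.

```python
import csv
import io


def classify(frog_genes, human_genes):
    """Label a row as one-to-one, one-to-many, many-to-one, or many-to-many."""
    left = "one" if len(frog_genes) == 1 else "many"
    right = "one" if len(human_genes) == 1 else "many"
    return f"{left}-to-{right}"


def read_orthologue_table(handle):
    """Parse an assumed three-column TSV into (orthogroup, frog, human, label) tuples."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    rows = []
    for orthogroup, frog_field, human_field in reader:
        frog = [gene.strip() for gene in frog_field.split(",")]
        human = [gene.strip() for gene in human_field.split(",")]
        rows.append((orthogroup, frog, human, classify(frog, human)))
    return rows


# Hypothetical table content for illustration only.
example = (
    "Orthogroup\tXenopus_laevis\tHomo_sapiens\n"
    "OG0000001\ttp53.L, tp53.S\tTP53\n"
    "OG0000002\tgeneA.L\tGENEA\n"
)

rows = read_orthologue_table(io.StringIO(example))
for orthogroup, frog, human, label in rows:
    print(orthogroup, label, frog, human)
```

For an allotetraploid, the many-to-one rows (two frog homeologs, one human gene) are the ones the original question is after.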

ADD REPLY • link 5.7 years ago by Mehmet ▴ 820

Ram · Answer 1 · 2019-09-11

3

5.7 years ago

Christophe Dessimoz ▴ 740

You can use OMA for this. If your genomes are in OMA, you can use the pairwise orthology function (https://omabrowser.org/oma/genomePW/).

In your case, since Xenopus laevis is not yet in OMA, you could use OMA standalone, which produces pairwise ortholog files that also include 1:many and many:many relationships.

ADD COMMENT • link updated 5.7 years ago by Ram 45k • written 5.7 years ago by Christophe Dessimoz ▴ 740