Best method for orthology prediction of more than two protein datasets
6.5 years ago

Hello all,

I am looking for suggestions on the currently accepted methodology for identifying orthologous proteins across multiple datasets. We are working with non-model eukaryotes. Our datasets are protein sequences predicted with TransDecoder, and we have done our best to eliminate redundant sequences. I am somewhat familiar with HaMStR, OrthoFinder, OrthoDB, etc., but I am not confident about which method would be best. Our goal is to rule out paralogous genes and construct a phylogenetic tree. We then want to explore certain genes of interest that are shared between the different species. Any links to good reviews would also be appreciated.

Best,

A.B.

genome rna-seq • 1.2k views

Hi,

Can you describe what you did to eliminate redundant sequences? How did you obtain your proteins, from a genome or from a transcriptome? If you have both genome-derived and transcriptome-derived proteins, I suggest first identifying orthologs within the genome-derived protein set and then using those orthologs to search the transcriptome-derived proteins. If you combine genome-derived and transcriptome-derived proteins in a single ortholog analysis, you may not recover enough orthologous proteins (>50), particularly if you have data from more than 10 species.

In addition to the tools you mentioned, you could use OMA, but it requires a lot of storage space and takes longer to run than the other tools.
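Whichever tool you use, a common next step for tree building is to keep only the groups that have exactly one gene from every species, since any group with multiple copies from one species contains paralogs. Below is a minimal sketch of that filter. It assumes you have already parsed the tool's output into a dictionary of orthogroup ID to (species, gene ID) pairs; the function name, the species labels, and the toy data are all made up for illustration and do not correspond to any tool's actual output format.

```python
from collections import Counter


def single_copy_orthogroups(orthogroups, species):
    """Return IDs of groups with exactly one gene from every listed species."""
    wanted = set(species)
    selected = []
    for group_id, members in orthogroups.items():
        # Count how many genes each species contributes to this group.
        counts = Counter(sp for sp, _gene in members)
        all_present = set(counts) == wanted
        one_copy_each = all(n == 1 for n in counts.values())
        if all_present and one_copy_each:
            selected.append(group_id)
    return selected


# Hypothetical toy input: orthogroup ID -> list of (species, gene ID) pairs.
toy_groups = {
    "OG1": [("spA", "a1"), ("spB", "b1"), ("spC", "c1")],
    "OG2": [("spA", "a2"), ("spA", "a3"), ("spB", "b2"), ("spC", "c2")],  # two spA copies
    "OG3": [("spA", "a4"), ("spB", "b3")],  # spC is missing
}

print(single_copy_orthogroups(toy_groups, ["spA", "spB", "spC"]))
```

With transcriptome-derived data, requiring every species to be present can leave very few groups, so in practice you may want to relax the filter to allow a few missing species.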

6.5 years ago

Check out the TreeFam papers. The project isn't active anymore, but the pipeline is now part of Ensembl Compara and the code is still available. You could either rerun the pipeline with your sequences, or use the Ensembl Compara HMMs to assign your proteins to families and add them to the corresponding trees.
By definition, you can only identify paralogues if you build a phylogenetic tree.
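To make the tree-based idea concrete, here is a minimal sketch of the species-overlap heuristic: an internal node of a gene tree is called a duplication if the same species appears on both sides of it, and a speciation otherwise. Genes separated by a duplication node are paralogues. This is a simplified illustration, not the method the TreeFam or Ensembl Compara pipelines use; the nested-tuple tree encoding, the "species|gene" leaf labels, and the function name are assumptions made up for this example.

```python
def classify_nodes(tree, events=None):
    """Label each internal node of a binary gene tree by species overlap.

    Leaves are strings of the form "species|gene"; internal nodes are
    2-tuples of subtrees. Returns (species set, events), where events is
    a post-order list of (event kind, sorted species below the node).
    """
    if events is None:
        events = []
    if isinstance(tree, str):
        # Leaf: the species name is the part before the "|" separator.
        return {tree.split("|")[0]}, events
    left, _ = classify_nodes(tree[0], events)
    right, _ = classify_nodes(tree[1], events)
    # Shared species on both sides imply a duplication at this node.
    kind = "duplication" if left & right else "speciation"
    events.append((kind, sorted(left | right)))
    return left | right, events


# Hypothetical toy gene tree: spA and spB each carry two copies that
# arose from a duplication after the split from spC.
toy_tree = (
    (("spA|a1", "spB|b1"), ("spA|a2", "spB|b2")),
    "spC|c1",
)

_, toy_events = classify_nodes(toy_tree)
for kind, species in toy_events:
    print(kind, species)
```

In this toy tree, a1 and a2 are paralogues because their last common ancestor is the duplication node, while a1 and b1 are orthologues because they meet at a speciation node.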
