Hi all, I'm really struggling so if someone could help I would appreciate it.
My aim is: I have a list of 20 mammalian species, all of which are in the InParanoid database. I want to identify a set of one-to-one orthologs across all 20 species (i.e. the same gene must be present in every species). Basically, I don't even know where to start. When I look at the options available on the InParanoid home page, none of them match what I specifically want to do.
I tried entering a list of genes into the gene search with 4 species selected as a test, and I received a software error. Then I tried selecting two species and pulling out the orthologs between them, but this doesn't help because I need the one-to-one orthologs shared across all the species.
If anyone could tell me specifically how I could pull out one-to-one orthologs for a set of 20 genomes I would really appreciate it. It doesn't necessarily have to involve InParanoid; I thought about doing reciprocal best hit BLAST, but I think InParanoid can deal with in-paralogs better.
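For reference, here is a minimal sketch of the reciprocal best hit idea mentioned above. It assumes you already have two all-vs-all BLAST result files in tabular format (`-outfmt 6`, where column 12 is the bitscore), one for each search direction. The function names, the `hit` helper and the gene names are made up for illustration. The toy data also shows the in-paralog concern: when two genes in one species both hit the same gene in the other, plain RBH keeps only one of them.

```python
import csv
from io import StringIO


def best_hits(blast_tab):
    """Map each query to its highest-bitscore subject.

    blast_tab: iterable of lines in BLAST tabular format (-outfmt 6),
    where column 1 is the query, column 2 the subject, column 12 the bitscore.
    """
    best = {}
    for row in csv.reader(blast_tab, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


def hit(query, subject, bitscore):
    """Build one toy outfmt 6 line; only query, subject and bitscore matter here."""
    return f"{query}\t{subject}\t90.0\t100\t0\t0\t1\t100\t1\t100\t1e-50\t{bitscore}\n"


# Toy data (hypothetical gene names): humA2 is an in-paralog of humA.
a_vs_b = StringIO(hit("humA", "musA", 200) + hit("humA2", "musA", 180)
                  + hit("humB", "musB", 150) + hit("humB", "musA", 50))
b_vs_a = StringIO(hit("musA", "humA", 200) + hit("musA", "humA2", 180)
                  + hit("musB", "humB", 150))

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)  # humA2 is dropped because musA's best hit is humA
```

With real data you would pass open file handles instead of the `StringIO` objects. Extending this to 20 species means running it for every species pair and then reconciling the results, which is where dedicated orthology tools tend to be more convenient.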
In Ensembl, we have got 1:1 orthologues for mammalian genomes identified with our homology pipeline (Protein trees and orthologies). The data is available for download from the FTP, Perl API and REST API.
Thank you, Ensembl is what I would usually use. Unfortunately, not all of the species I want are in Ensembl, so I'm looking for an alternative. I also thought about reciprocal best hit BLAST, but I don't think that method will handle in-paralogs well. Thank you.
Dear Tom,
I don't know whether you have solved this yet; I ran into the same problem. I decided to use TreeFam, which is based on gene family trees. However, if I look for genes that have only a single copy in each species, the list is ridiculously short (<150 genes across 12 species).
Could you share your experience? I would really appreciate it.
I used OrthoFinder: http://www.stevekellylab.com/software/orthofinder and I tested different parameter combinations in this tool to see what impact they had on the analysis.
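As a rough illustration of the "single copy in every species" filter discussed in this thread, the sketch below assumes a tab-separated orthogroup table with one row per group, one column per species, and comma-separated gene IDs in each cell. As far as I know that is the layout OrthoFinder writes, but check your own output before relying on it. The function name, species names and gene IDs are hypothetical. Recent OrthoFinder releases also report single-copy orthogroups directly, so look in the results directory first.

```python
import csv
from io import StringIO


def single_copy_orthogroups(table, species=None):
    """Return {orthogroup: {species: gene}} for groups with exactly one gene
    in each requested species.

    table:   file-like object with a header row; the first column is the
             orthogroup ID, the remaining columns are species.
    species: optional subset of species columns; defaults to all of them.
    """
    reader = csv.DictReader(table, delimiter="\t")
    id_column = reader.fieldnames[0]
    wanted = list(species) if species else reader.fieldnames[1:]
    result = {}
    for row in reader:
        genes = {}
        for sp in wanted:
            cell = row.get(sp) or ""
            members = [g.strip() for g in cell.split(",") if g.strip()]
            if len(members) != 1:
                break  # missing or duplicated in this species
            genes[sp] = members[0]
        else:
            result[row[id_column]] = genes
    return result


# Toy table (hypothetical IDs): OG2 has a duplicate in spA, OG3 is missing in spB.
text = (
    "Orthogroup\tspA\tspB\tspC\n"
    "OG1\ta1\tb1\tc1\n"
    "OG2\ta2, a3\tb2\tc2\n"
    "OG3\ta4\t\tc4\n"
)

strict = single_copy_orthogroups(StringIO(text))
relaxed = single_copy_orthogroups(StringIO(text), species=["spA", "spC"])
print(sorted(strict), sorted(relaxed))
```

The `species` argument is there because a strict all-species requirement can shrink the list dramatically; restricting the filter to a subset of species shows which genomes are responsible for most of the losses.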
Are you referring to this site? It was last updated in 2007. Are you sure you want to rely on that information?
You may have better luck with the alignments provided by UCSC.
No, this site; the paper came out in 2015 (paper called "InParanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic"). Many thanks.
It seems some of their data for download at least may be old-ish (2014). As for the error you've encountered, it may be worth contacting their support team.
It looks like the downloadable data from the new site is only for pairwise comparisons. You could email their support and ask whether they could run a custom query against the back-end database for what you need.
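If only pairwise tables are available, one possible workaround is to pick a reference species and intersect its pairwise ortholog lists against every other species. The sketch below assumes you have already parsed each pairwise file into `(reference_gene, other_gene)` tuples; the parsing step is omitted because it depends on the download format. The function name, species and gene IDs are hypothetical. Note that this is a star-shaped comparison through the reference only: it does not verify that the non-reference species are also one-to-one with each other.

```python
from collections import Counter


def shared_one_to_one(pairwise):
    """Combine pairwise ortholog lists through a common reference species.

    pairwise: {species: [(reference_gene, gene_in_species), ...]}
    Returns {reference_gene: {species: gene}} for reference genes that have
    exactly one partner in every species, where that partner is not shared
    with another reference gene.
    """
    per_species = {}
    for species, pairs in pairwise.items():
        ref_counts = Counter(ref for ref, _ in pairs)
        other_counts = Counter(other for _, other in pairs)
        per_species[species] = {
            ref: other
            for ref, other in pairs
            if ref_counts[ref] == 1 and other_counts[other] == 1
        }
    if not per_species:
        return {}
    # Keep only reference genes with a one-to-one partner in every species.
    shared = set.intersection(*(set(m) for m in per_species.values()))
    return {
        ref: {sp: per_species[sp][ref] for sp in per_species}
        for ref in sorted(shared)
    }


# Toy input (hypothetical IDs): H3 has two mouse partners, H4 is absent in mouse.
pairwise = {
    "mouse": [("H1", "M1"), ("H2", "M2"), ("H3", "M3"), ("H3", "M3b")],
    "dog": [("H1", "D1"), ("H2", "D2"), ("H4", "D4")],
}

core = shared_one_to_one(pairwise)
print(core)
```

For 20 species this needs 19 pairwise tables against the chosen reference. The result can depend on which species is used as the reference, so it may be worth repeating the run with a different one and comparing.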
If you do email their support, ask them to take down that older website. It is what came up first in a Google search.