I have tried three different methods to find single-copy gene families. The first two were OrthoFinder and OrthoMCL; the third was InParanoid. I downloaded InParanoid 4.2 from https://bitbucket.org/sonnhammergroup/inparanoid4/src/master/ and ran the program successfully. This is the output:
200 sequences in file EC
200 sequences in file SC
120 sequences EC have homologs in dataset SC
79 sequences SC have homologs in dataset EC
221 EC-EC matches
135 SC-SC matches
###################################
25 groups of orthologs
62 in-paralogs from EC
47 in-paralogs from SC
Grey zone 0 bits
Score cutoff 40 bits
In-paralogs with confidence less than 0.05 not shown
Sequence overlap cutoff 0.5
Group merging cutoff 0.5
Scoring matrix BLOSUM62
###################################
___________________________________________________________________________________
Group of orthologs #1. Best score 590 bits
Score difference with first non-orthologous sequence - EC:590 SC:435
LEU2_ECOLI 100.00% LEU2_YEAST 100.00%
___________________________________________________________________________________
Group of orthologs #2. Best score 467 bits
Score difference with first non-orthologous sequence - EC:365 SC:396
ADH3_ECOLI 100.00% FADH_YEAST 100.00%
___________________________________________________________________________________
Group of orthologs #3. Best score 463 bits
Score difference with first non-orthologous sequence - EC:463 SC:463
6PGD_ECOLI 100.00% 6PG1_YEAST 100.00%
6PG9_ECOLI 90.28% 6PG2_YEAST 81.59%
___________________________________________________________________________________
Group of orthologs #4. Best score 446 bits
Score difference with first non-orthologous sequence - EC:446 SC:13
FTSH_ECOLI 100.00% YME1_YEAST 100.00%
___________________________________________________________________________________
Group of orthologs #5. Best score 411 bits
Score difference with first non-orthologous sequence - EC:145 SC:411
ADH2_ECOLI 100.00% ADH4_YEAST 100.00%
___________________________________________________________________________________
Group of orthologs #6. Best score 339 bits
Score difference with first non-orthologous sequence - EC:74 SC:65
LYSP_ECOLI 100.00% CAN1_YEAST 100.00%
ALP1_YEAST 52.69%
LYP1_YEAST 48.95%
DIP5_YEAST 5.39%
___________________________________________________________________________________
Now I have the ortholog information, but I don't know how to find the single-copy gene families. I guess the first group of orthologs (LEU2_ECOLI and LEU2_YEAST) is a single-copy gene family, but I don't know whether that is right.
Could anybody please suggest how to find single-copy gene families from InParanoid output, especially when analysing many species?
Best wishes!
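One way to read single-copy groups off an InParanoid table like the one above is to keep only the groups that contain exactly one sequence from each species. The following is a minimal sketch, not part of InParanoid itself. It assumes the text layout shown in the post, and it assumes that the species can be read from the suffix after the last underscore in each identifier (`ECOLI`, `YEAST`); the function names `parse_groups` and `single_copy` are hypothetical. Note that the output hides in-paralogs with confidence below 0.05, so a group that looks single-copy here may still have low-confidence in-paralogs.

```python
import re
from collections import defaultdict

# Header line of each group, e.g. "Group of orthologs #1. Best score 590 bits"
GROUP_RE = re.compile(r"^Group of orthologs #(\d+)\.")
# A member entry is an identifier followed by a confidence percentage
MEMBER_RE = re.compile(r"(\S+)\s+(\d+(?:\.\d+)?)%")


def parse_groups(text):
    """Return {group_number: {species_suffix: [sequence_ids]}}."""
    groups = {}
    current = None
    for line in text.splitlines():
        header = GROUP_RE.match(line.strip())
        if header:
            current = int(header.group(1))
            groups[current] = defaultdict(list)
            continue
        if current is None:
            continue  # still in the summary section before the first group
        for seq_id, _confidence in MEMBER_RE.findall(line):
            # Assumption: species is the suffix after the last underscore
            species = seq_id.rsplit("_", 1)[-1]
            groups[current][species].append(seq_id)
    return groups


def single_copy(groups, species):
    """Group numbers with exactly one sequence from every listed species."""
    return [
        number
        for number, members in groups.items()
        if all(len(members.get(s, [])) == 1 for s in species)
    ]


# Groups 1, 3 and 6 copied from the output in the post
SAMPLE = """\
Group of orthologs #1. Best score 590 bits
Score difference with first non-orthologous sequence - EC:590 SC:435
LEU2_ECOLI 100.00% LEU2_YEAST 100.00%
Group of orthologs #3. Best score 463 bits
Score difference with first non-orthologous sequence - EC:463 SC:463
6PGD_ECOLI 100.00% 6PG1_YEAST 100.00%
6PG9_ECOLI 90.28% 6PG2_YEAST 81.59%
Group of orthologs #6. Best score 339 bits
Score difference with first non-orthologous sequence - EC:74 SC:65
LYSP_ECOLI 100.00% CAN1_YEAST 100.00%
ALP1_YEAST 52.69%
LYP1_YEAST 48.95%
DIP5_YEAST 5.39%
"""

result = single_copy(parse_groups(SAMPLE), ["ECOLI", "YEAST"])
print(result)
```

On this sample only group 1 (LEU2_ECOLI and LEU2_YEAST) passes the filter; group 3 has two sequences per species and group 6 has four yeast sequences. Since each InParanoid run compares one pair of species, a many-species analysis would need the pairwise results combined in some further step, which this sketch does not attempt.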
I'd just use OrthoFinder. It outputs a separate directory of single-copy orthologs if any such (ortho)groups exist in your data set.
I also used OrthoFinder, but I'm worried that the result of using only OrthoFinder is inaccurate. So I want to find the consensus list between several tools (OrthoFinder, OrthoMCL, TreeFam, InParanoid, ...).
I don't think you need to pool the results of multiple methods here. If you're using the latest version of OrthoFinder, you're getting the best results from among any of the tools you've mentioned. The only thing you'll need to worry about is making sure that you're providing OrthoFinder with the right species tree (if the trees it has estimated on its own appear to be incorrect).
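If a consensus list across tools is still wanted, one simple approach is to treat each single-copy group as a set of gene IDs and keep only the groups that every tool reports with identical membership. The sketch below assumes each tool's single-copy groups have already been extracted into plain Python lists; the function name `consensus_groups` is hypothetical, and the example lists are made-up placeholders, not real output from any of the tools.

```python
def consensus_groups(*tool_results):
    """Return the groups that appear, with identical members, in every tool's result.

    Each argument is an iterable of groups; each group is an iterable of gene IDs.
    """
    as_sets = [
        {frozenset(group) for group in groups}  # order of members does not matter
        for groups in tool_results
    ]
    return set.intersection(*as_sets)


# Made-up placeholder inputs, for illustration only
tool_a = [["LEU2_ECOLI", "LEU2_YEAST"], ["ADH3_ECOLI", "FADH_YEAST"]]
tool_b = [["LEU2_YEAST", "LEU2_ECOLI"], ["FTSH_ECOLI", "YME1_YEAST"]]

shared = consensus_groups(tool_a, tool_b)
print(shared)
```

Exact set matching is strict: a group that differs by a single gene between tools is dropped, so the consensus can be much smaller than any one tool's list.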