Dear all,
I have 3 species (A, B and C) and used OrthoFinder to find the genes that are present in species A and B but absent in species C. According to the OrthoFinder results, about 19691 orthogroups (OGs, i.e. gene families) are present in A and B but absent in C. I think that is too many, but I am not sure what went wrong.
Then I ran OrthoFinder with only species A and C and looked for genes present in A but absent in C. This gave only 23 OGs (with B and C, the result was also only 32 OGs), which is far fewer than when A, B and C are run together.
So I am quite confused about why this happens and not sure whether the results are reliable. Any suggestions would be much appreciated!
Thanks!!
I can certainly explain why there would be some differences, but this difference is huge. Are you sure you did not make a technical error somewhere in the processing/running?
How exactly did you count the absent genes?
Hi, thanks for your reply! No error was reported. Here is the script I used
But I may have made a mistake in interpreting the results. I used the files "Orthogroups.GeneCount.csv" and "Orthogroups.csv". For example, from "Orthogroups.GeneCount.csv" I extracted the OGs with a count of 0 in species C but > 0 in species A and B, and treated the genes in those OGs as the genes present in A and B but absent in C.
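The filtering described above can be sketched roughly as follows in Python. This is only an illustration under assumptions: the gene count table is assumed to be tab-separated (which may depend on the OrthoFinder version), the species columns are assumed to be named A, B and C, the function name absent_in_one is hypothetical, and the example rows are made up. It also shows that "> 0 in A, B" can be read in two ways (present in both A and B, or present in at least one of them), and the two readings can give quite different counts.

```python
import csv
import io


def absent_in_one(handle, present, absent, require_all=True, delimiter="\t"):
    """Return orthogroup IDs with zero genes in every `absent` species and
    genes in all (require_all=True) or at least one (require_all=False)
    of the `present` species.

    Hypothetical helper: assumes the first column holds the orthogroup ID
    and the species columns hold integer gene counts.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    id_col = reader.fieldnames[0]  # first column = orthogroup ID
    combine = all if require_all else any
    hits = []
    for row in reader:
        is_present = combine(int(row[s]) > 0 for s in present)
        is_absent = all(int(row[s]) == 0 for s in absent)
        if is_present and is_absent:
            hits.append(row[id_col])
    return hits


# Made-up example table, not real OrthoFinder output.
example = (
    "Orthogroup\tA\tB\tC\tTotal\n"
    "OG0000000\t3\t2\t4\t9\n"
    "OG0000001\t1\t1\t0\t2\n"
    "OG0000002\t2\t0\t0\t2\n"
    "OG0000003\t0\t5\t0\t5\n"
)

# Present in BOTH A and B, absent in C.
strict = absent_in_one(io.StringIO(example), ["A", "B"], ["C"])
# Present in A OR B, absent in C.
loose = absent_in_one(io.StringIO(example), ["A", "B"], ["C"], require_all=False)

print(strict)
print(loose)
```

With a real file, the io.StringIO(example) part would be replaced by an open file handle for the gene count table.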
Then I found a paper that says: "The orthologs and orthogroups between bottlenose catfish and channel catfish were generated using OrthoFinder (42). To obtain the species-specific genes, Protein BLAST (BLASTP) was performed in which genes not included in the orthogroups were queried against the genes in the orthogroups within the same species, with a maximal E-value of 1e-10. A reciprocal BLASTP with a maximal E-value of 1e-5 was used to query genes with no hits from previous steps (9). The genes with no hits to any orthologs were considered as species-specific genes."
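As a rough sketch of one step in the quoted procedure (finding query genes with no BLASTP hit under a maximal E-value), something like the following could be used. It assumes the BLASTP results are in the standard 12-column tabular format (-outfmt 6), where the query ID is column 1 and the E-value is column 11. The function name queries_without_hits and all gene IDs and hit lines are made up for illustration, and this is not the paper's actual code.

```python
def queries_without_hits(query_ids, blast_lines, max_evalue):
    """Return the query IDs that have no hit with E-value <= max_evalue.

    Hypothetical helper: `blast_lines` are lines of BLAST tabular output
    (12 tab-separated columns; column 1 = query ID, column 11 = E-value).
    """
    with_hit = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            with_hit.add(qseqid)
    # Keep the original order of the queries.
    return [q for q in query_ids if q not in with_hit]


# Made-up example: three query genes, two of which have a reported hit.
queries = ["geneX", "geneY", "geneZ"]
hits = [
    "geneX\tgene1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
    "geneY\tgene2\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.001\t35",
]

no_hit = queries_without_hits(queries, hits, max_evalue=1e-10)
print(no_hit)
```

Here geneY has a hit, but its E-value is above the cutoff, so it is reported together with geneZ, which has no hit at all. Genes like these would then go into the reciprocal BLASTP step described in the quote.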
So maybe my interpretation is wrong?