I've obtained a set of genes specific to a bacterial strain (Group A) that are not present in another group of strains (Group B). How can I validate my results using alternative methods?
I think people tend to interpret orthology results in black-and-white terms, which isn't very accurate. Just because OrthoMCL produced clusters of genes specific to a clade doesn't mean that the gene family is absent from the other species you've included in the analysis, unless you define a gene purely by sequence information.
Without getting into a philosophical debate on the definition of a gene, it is possible that a gene evolved rapidly in the other species and its sequence changed just enough for OrthoMCL not to cluster it within the gene family, even though the gene retained the same function. If you change the clustering granularity setting in MCL (the -I inflation parameter), you might get different clustering results.
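One way to check how robust the clade-specific clusters are is to rerun MCL at several -I values and see which clusters still contain only Group A members. Below is a minimal Python sketch, assuming the MCL output has one cluster per line with tab-separated members and OrthoMCL-style `taxon|gene` identifiers. The taxon codes (`strA`, `strB`), the example clusters, and the function name are hypothetical.

```python
def clade_specific_clusters(mcl_lines, group_a_taxa):
    """Return clusters whose members all belong to taxa in group_a_taxa.

    Assumes one cluster per line, tab-separated members, and
    OrthoMCL-style IDs of the form 'taxon|gene'.
    """
    group_a_taxa = set(group_a_taxa)
    specific = []
    for line in mcl_lines:
        members = line.strip().split("\t")
        if members == [""]:
            continue  # skip blank lines
        # Taxon code is everything before the first '|'.
        taxa = {m.split("|", 1)[0] for m in members}
        if taxa <= group_a_taxa:
            specific.append(members)
    return specific


# Hypothetical clusterings of the same genes at two inflation values.
low_inflation = ["strA|g1\tstrA|g2\tstrB|g7", "strB|g8\tstrB|g9"]
high_inflation = ["strA|g1\tstrA|g2", "strB|g7", "strB|g8\tstrB|g9"]

for label, clusters in [("I=1.5", low_inflation), ("I=3.0", high_inflation)]:
    print(label, clade_specific_clusters(clusters, {"strA"}))
```

Higher inflation gives finer-grained clusters, so a cluster that is Group A-specific only at high inflation is probably a weaker candidate than one that stays specific across the whole range.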
I would also suggest using OrthoFinder (http://www.stevekellylab.com/software/orthofinder), which is very similar to OrthoMCL except that it adds a normalization step for gene length.
Use another program.
There are two nice programs for finding orthologs: OMA and OrthoDB-soft.
http://omabrowser.org/standalone
All the references are inside.
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383991/
This paper is about the second program.
Good luck!
I'm currently mapping all raw reads from Group B against the set of Group A-specific genes using the bwa mapper. Parsing the resulting SAM file is proving difficult: my idea is to get the coverage of the Group A-specific genes, so I can see whether any raw reads map to them. I'm now trying to use BEDTools for that, but I don't have a result yet since I'm quite new to the suite. Ideally, these genes would have little or no coverage, which would agree with my OrthoMCL results.
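For what it's worth, the per-gene coverage idea can be sketched directly in Python without any extra tools. This is only an illustration, assuming an uncompressed SAM file whose header contains `@SQ` lines for each gene; the function name and the toy records are hypothetical, and for real data samtools or BEDTools on a sorted BAM will be far more practical. It reports breadth of coverage, i.e. the fraction of each gene covered by at least one read.

```python
import re
from collections import defaultdict

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def breadth_of_coverage(sam_lines):
    """Return {reference: fraction of positions covered by >= 1 read}.

    Reference lengths are taken from the @SQ header lines.
    """
    lengths = {}
    covered = defaultdict(set)
    for line in sam_lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("@"):
            if fields[0] == "@SQ":
                tags = dict(f.split(":", 1) for f in fields[1:])
                lengths[tags["SN"]] = int(tags["LN"])
            continue
        flag, ref, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 4 or ref == "*" or cigar == "*":
            continue  # unmapped read
        for n, op in CIGAR_OP.findall(cigar):
            n = int(n)
            if op in "M=X":
                # Aligned bases: mark these reference positions as covered.
                covered[ref].update(range(pos, pos + n))
                pos += n
            elif op in "DN":
                pos += n  # advances along the reference without covering it
    return {ref: len(covered[ref]) / length for ref, length in lengths.items()}


# Toy SAM records (hypothetical gene names and reads).
sam = [
    "@SQ\tSN:geneA1\tLN:10",
    "@SQ\tSN:geneA2\tLN:10",
    "r1\t0\tgeneA1\t1\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t0\tgeneA1\t4\t60\t2M1D3M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
]
print(breadth_of_coverage(sam))
```

Genes with a breadth close to 0 in every Group B strain would be consistent with the clustering result, while genes covered over most of their length would suggest they are not really specific.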
I'm sure someone has something to say about this strategy.
It seems the genes are not specific: I found raw reads mapping to them. Let's try BLAST.