Validation of OrthoMCL results
8.7 years ago
biotech ▴ 570

I've obtained a set of genes specific to a bacterial strain (Group A) and not present in another group of strains (Group B). How could I validate my OrthoMCL results using alternative methods?

bacteria pangenome • 3.4k views
8.7 years ago

I think people tend to interpret orthology results in black-and-white terms, which isn't very accurate. Just because OrthoMCL produced clusters of genes specific to a clade doesn't mean that the gene family is absent from the other species you've included in the analysis, unless you define a gene purely by sequence information.

Without getting into a philosophical debate on the definition of a gene, it is possible that a gene underwent rapid evolution in the other species and its sequence changed just enough for OrthoMCL not to cluster it within the gene family, even though the gene retained the same function. If you change the clustering granularity setting in MCL (the -I inflation parameter), you might get different clustering results.
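To make the granularity point concrete, here is a minimal Python sketch for checking which "Group A-specific" genes stay specific across several MCL runs. It assumes you have already rerun MCL at a few inflation values (for example something like `mcl mclInput --abc -I 1.5 -o out_1.5`), that each output line is one cluster of tab-separated members, and that members are labelled `taxon|gene` as in OrthoMCL. The function names and the example labels are made up for illustration.

```python
from typing import Dict, Iterable, List, Set


def taxon_of(member: str) -> str:
    """Return the taxon prefix of a label such as 'A1|g0042' (assumed format)."""
    return member.split("|", 1)[0]


def group_specific_genes(cluster_lines: Iterable[str], group_a: Set[str]) -> Set[str]:
    """Collect genes from clusters whose members all come from Group A taxa.

    Each input line is one cluster with whitespace/tab-separated members.
    """
    specific: Set[str] = set()
    for line in cluster_lines:
        members = line.split()
        if members and all(taxon_of(m) in group_a for m in members):
            specific.update(members)
    return specific


def stable_specific_genes(runs: Dict[float, List[str]], group_a: Set[str]) -> Set[str]:
    """Genes that are Group A-specific at every inflation value tested."""
    per_run = [group_specific_genes(lines, group_a) for lines in runs.values()]
    return set.intersection(*per_run) if per_run else set()


# Toy example: at I=1.5 gene g3 from strain B1 clusters with g1/g7,
# at I=3.0 the finer clustering splits it off.
runs = {
    1.5: ["A1|g1\tA2|g7\tB1|g3", "A1|g2\tA2|g9"],
    3.0: ["A1|g1\tA2|g7", "B1|g3", "A1|g2\tA2|g9"],
}
print(sorted(stable_specific_genes(runs, {"A1", "A2"})))
# → ['A1|g2', 'A2|g9']
```

In this toy case g1 and g7 look Group A-specific only at the higher inflation value, so they would be the ones to double-check by other means.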

I would also suggest using OrthoFinder (http://www.stevekellylab.com/software/orthofinder), which is very similar to OrthoMCL except that it adds a normalization step for gene length.

8.7 years ago
natasha.sernova ★ 4.0k

Use another program.

There are two nice programs for finding orthologs: OMA and OrthoDB-soft.

http://omabrowser.org/standalone

All the references are inside.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383991/

This paper is about the second program.

Good luck!

8.7 years ago
biotech ▴ 570

I'm currently mapping all raw reads from Group B against the set of Group A-specific genes using the bwa mapper. Parsing the resulting SAM file is proving hard: my idea is to get the coverage of each Group A-specific gene, so I can see whether any raw reads map to it. I'm now trying to use BEDTools for that, but I don't have a result yet since I'm quite new to this suite. Ideally these genes would have little or no coverage, which would agree with my OrthoMCL results.
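One possible way to get per-gene coverage without parsing the SAM file by hand, sketched in Python under some assumptions: the alignments have been sorted and converted to BAM, and per-position depth has been written out with `samtools depth -aa` for a single BAM file (three tab-separated columns: reference name, position, depth; `-aa` also reports genes with no reads at all). Because the reads were mapped against the genes themselves, each reference name is one gene. The function name and the `min_depth` threshold are illustrative choices, not part of any tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class GeneCoverage(NamedTuple):
    length: int        # number of positions reported for the gene
    breadth: float     # fraction of positions with depth >= min_depth
    mean_depth: float  # average depth over all reported positions


def coverage_per_gene(depth_lines: Iterable[str], min_depth: int = 1) -> Dict[str, GeneCoverage]:
    """Summarise 'gene<TAB>position<TAB>depth' lines into per-gene coverage."""
    positions = defaultdict(int)
    covered = defaultdict(int)
    total_depth = defaultdict(int)
    for line in depth_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        gene, _pos, depth_str = line.split("\t")
        depth = int(depth_str)
        positions[gene] += 1
        total_depth[gene] += depth
        if depth >= min_depth:
            covered[gene] += 1
    return {
        gene: GeneCoverage(n, covered[gene] / n, total_depth[gene] / n)
        for gene, n in positions.items()
    }


# Toy input: geneX is half covered, geneY has no reads at all.
example = [
    "geneX\t1\t0", "geneX\t2\t0", "geneX\t3\t4", "geneX\t4\t8",
    "geneY\t1\t0", "geneY\t2\t0",
]
for gene, cov in sorted(coverage_per_gene(example).items()):
    print(gene, cov.length, cov.breadth, cov.mean_depth)
# → geneX 4 0.5 3.0
# → geneY 2 0.0 0.0
```

With real data the lines would come from the depth file, e.g. `coverage_per_gene(open("groupB_vs_specific.depth"))` (hypothetical file name). Genes with breadth near zero are consistent with absence from Group B; genes covered over most of their length probably deserve a closer look.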

I'm sure someone has something to say about this strategy.


It seems the genes are not specific; I found raw data for them. Let's try BLAST.
