OrthoMCL or CD-HIT for protein clustering?
10.6 years ago
Naren ▴ 1000

What are the basic differences between these two tools when clustering large datasets of bacterial proteins?

To study pan-genome structure, and more specifically unique bacterial genes, which one would be better?

(Apart from the fact that OrthoMCL is tedious to install, requires MySQL, etc.)

[Any other tool suggestions are welcome.]

cd-hit orthomcl clustering • 3.6k views

If you want to identify orthologues, reciprocal BLASTP searches (or something lighter, e.g. USEARCH) could work too in this case.
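As a rough illustration of the reciprocal best hit idea, here is a minimal sketch. It assumes you have already run the search in both directions (genome A against B, and B against A) and have BLAST-style tabular output with the 12 default `-outfmt 6` columns, where column 1 is the query, column 2 the subject and column 12 the bit score. The function names (`best_hits`, `reciprocal_best_hits`, `make_row`) and the toy IDs are hypothetical, chosen only for this example.

```python
from typing import Dict, Iterable, List, Tuple


def best_hits(rows: Iterable[str]) -> Dict[str, str]:
    """Map each query ID to the subject ID with the highest bit score.

    Assumes tab-separated lines with at least 12 columns
    (query in column 1, subject in column 2, bit score in column 12).
    """
    best: Dict[str, Tuple[str, float]] = {}
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(
    a_vs_b: Iterable[str], b_vs_a: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


def make_row(query: str, subject: str, bitscore: float) -> str:
    """Build a toy tabular line; the middle columns are placeholder values."""
    filler = ["90.0", "100", "10", "0", "1", "100", "1", "100", "1e-50"]
    return "\t".join([query, subject] + filler + [str(bitscore)])


# Toy data: a3's best hit is b1, but b1 prefers a1, so a3 has no reciprocal partner.
toy_a_vs_b = [
    make_row("a1", "b1", 200),
    make_row("a1", "b2", 150),
    make_row("a2", "b2", 180),
    make_row("a3", "b1", 100),
]
toy_b_vs_a = [
    make_row("b1", "a1", 200),
    make_row("b2", "a1", 150),
    make_row("b2", "a2", 180),
]
toy_pairs = reciprocal_best_hits(toy_a_vs_b, toy_b_vs_a)
```

In this toy setup, proteins such as `a3` that end up without a reciprocal partner would be the first candidates to inspect as genome-specific, though real analyses usually add identity and coverage filters before drawing that conclusion.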

10.6 years ago
cts ★ 1.7k

CD-HIT clusters proteins based specifically on sequence identity. OrthoMCL uses a graph-based clustering method to find orthologs without a specific sequence identity cutoff. Neither one is better; each is used for a different purpose. CD-HIT is used for removing redundancy in a sequence set, such as duplicated sequences. OrthoMCL is used to find orthologs between species.
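To make the identity-cutoff idea concrete, here is a minimal sketch of greedy incremental clustering in the spirit of CD-HIT: sequences are processed longest first, and each one either joins the first representative it matches at or above the threshold or becomes a new representative. This is not CD-HIT's implementation. In particular, `toy_identity` is a crude ungapped stand-in for a real alignment, and all names and sequences here are hypothetical.

```python
from typing import Dict, List


def toy_identity(a: str, b: str) -> float:
    """Fraction of matching positions over the shorter sequence.

    Assumption: sequences are compared left-aligned with no gaps.
    A real tool would align the sequences first.
    """
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / shorter


def greedy_cluster(seqs: Dict[str, str], threshold: float) -> Dict[str, List[str]]:
    """Cluster sequences greedily; keys of the result are representatives."""
    clusters: Dict[str, List[str]] = {}
    # Longest sequences first, ties broken by name for reproducibility.
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        for rep in clusters:
            if toy_identity(seqs[name], seqs[rep]) >= threshold:
                clusters[rep].append(name)  # join the first matching cluster
                break
        else:
            clusters[name] = [name]  # no match: start a new cluster
    return clusters


toy_seqs = {
    "p1": "MKTAYIAKQR",
    "p2": "MKTAYIAKQ",   # truncated copy of p1
    "p3": "MKTAYLAKQR",  # one substitution relative to p1
    "p4": "GGGGGGGG",    # unrelated
}
clusters_90 = greedy_cluster(toy_seqs, 0.90)
clusters_95 = greedy_cluster(toy_seqs, 0.95)
```

Changing the threshold changes the clusters directly, which is the main practical contrast with a graph-based approach where cluster membership comes from the structure of the all-vs-all similarity graph rather than from a single cutoff.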
