Question

Clusters of orthologous research tools

4 months ago

anna • 0

Hey, I am annotating 10 bacterial genomes belonging to the same species and I am looking for a way to analyze the clusters of orthologous genes. Can you tell me which path to follow to identify them and perhaps gather them in a table? Is there a tool that allows you to compare the protein sequences of the various genomes and identify the shared COGs?

Thanks for any help,

A.

genome annotation • 359 views

updated 4 months ago by dthorbur ★ 2.5k • written 4 months ago by anna • 0

score 1 · Answer 1 · 2024-07-31

Hi Anna, there are many options for this kind of gene-based pangenome analysis.

I will start with our own GET_HOMOLOGUES (https://github.com/eead-csic-compbio/get_homologues), which you can use by following its bacteria tutorial and manual. Compared to other options out there, I guess it has two advantages:

It supports different clustering algorithms (BDBH, OMCL, COGS) and you can get consensus core clusters.
You can pipe the resulting clusters to https://github.com/vinuesa/get_phylomarkers to compute robust phylogenies from the sequence clusters.
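
To make the idea of consensus core clusters concrete, here is a minimal Python sketch. It is not GET_HOMOLOGUES code; the `genome|gene` ID format, the function names and the toy clusters are all made up for illustration. A cluster is treated as core if it contains a gene from every genome, and as consensus if exactly the same core cluster is produced by every algorithm.

```python
def core_clusters(clusters, n_genomes):
    """Return the clusters that contain at least one gene from every genome.

    Gene IDs are assumed (hypothetically) to look like 'genome|gene'.
    """
    core = set()
    for cluster in clusters:
        genomes = {gene.split("|")[0] for gene in cluster}
        if len(genomes) == n_genomes:
            core.add(frozenset(cluster))
    return core


def consensus_core(results_by_algorithm, n_genomes):
    """Keep only the core clusters that every algorithm reports identically."""
    core_sets = [core_clusters(c, n_genomes) for c in results_by_algorithm.values()]
    return set.intersection(*core_sets)


# Toy clusters for three genomes (gA, gB, gC) from two hypothetical runs.
results = {
    "BDBH": [
        {"gA|dnaA", "gB|dnaA", "gC|dnaA"},
        {"gA|recA", "gB|recA", "gC|recA"},
        {"gA|xyz", "gB|xyz"},  # missing in gC, so not core
    ],
    "OMCL": [
        {"gA|dnaA", "gB|dnaA", "gC|dnaA"},
        {"gA|recA", "gB|recA", "gC|recA", "gC|recA2"},  # extra paralog
        {"gA|xyz", "gB|xyz"},
    ],
}

consensus = consensus_core(results, n_genomes=3)
for cluster in consensus:
    print(sorted(cluster))
```

In this toy example only the dnaA cluster survives: the recA cluster is core in both runs, but its membership differs between them, so it is not a consensus cluster.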

GET_HOMOLOGUES can be installed with conda and can use an HPC cluster to parallelize tasks. A Docker container shipping both GET_HOMOLOGUES and GET_PHYLOMARKERS is available at https://hub.docker.com/r/csicunam/get_homologues.

Another obvious software choice would be https://sanger-pathogens.github.io/Roary for instance.

score 1 · Answer 2 · 2024-07-31

4 months ago

dthorbur ★ 2.5k

I'm a big fan of OrthoFinder. Whilst it's designed for interspecific comparisons, I have used it for different assemblies of the same species before with success.

It even provides gene trees for each orthogroup, and an overall species tree based on all single-copy ortholog trees.
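
Since the question asks for a table of shared clusters, here is a rough Python sketch of how one might summarise OrthoFinder's `Orthogroups.tsv` (tab-separated, one column per genome, genes within a cell separated by commas). The strain names, gene IDs and the `summarize_orthogroups` helper are invented for the example, and the category labels are my own, not OrthoFinder's.

```python
import csv
import io

# Small stand-in for an Orthogroups.tsv file; names and IDs are made up.
SAMPLE_TSV = (
    "Orthogroup\tstrain1\tstrain2\tstrain3\n"
    "OG0000000\ts1_001, s1_002\ts2_001\ts3_001\n"
    "OG0000001\ts1_003\ts2_002\ts3_002\n"
    "OG0000002\ts1_004\t\ts3_003\n"
)


def summarize_orthogroups(handle):
    """Count genes per genome in each orthogroup and label the orthogroup."""
    reader = csv.DictReader(handle, delimiter="\t")
    genomes = [name for name in reader.fieldnames if name != "Orthogroup"]
    table = []
    for record in reader:
        counts = {}
        for genome in genomes:
            cell = record[genome] or ""  # missing trailing cells come back as None
            counts[genome] = len([g for g in cell.split(",") if g.strip()])
        if all(n == 1 for n in counts.values()):
            category = "single-copy core"
        elif all(n >= 1 for n in counts.values()):
            category = "core"
        else:
            category = "accessory"
        table.append({"Orthogroup": record["Orthogroup"], **counts, "category": category})
    return table


table = summarize_orthogroups(io.StringIO(SAMPLE_TSV))
for row in table:
    print(row)
```

For a real run you would pass `open("Orthogroups.tsv", newline="")` instead of the `io.StringIO` sample, with the path adjusted to wherever your OrthoFinder results directory is.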
