Question

The Role Of Non-Conserved Regions In Homologous Genes

2

12.5 years ago

Justin ▴ 470

Consider a hypothetical situation: a human gene G performs function F, and its homolog in mouse is gene H. Overall the two sequences are not conserved (represented by dashes), but a small portion is conserved (represented by nucleotides), and the conserved region turns out to be a binding site for a protein.

Gene G (human):  ------------------------ACCCGATCGATCGCAGT----------------
Gene H (mouse):  ------------------------ACCCGATCGATCGCAGT----------------

With that in mind, I have two questions:

1) Starting from G, how do you find that H is its homolog in mouse?
- For example, how do the HomoloGene database or Ensembl do it when they present the homologs of a gene? Is it done experimentally, by showing that the genes perform a similar function in different species?

2) What could be the role of the non-conserved regions?
- Could the non-conserved sequences look completely different yet share some 2D structure, or is at least some sequence conservation still needed? What kinds of programs would be best suited to look at that?
- Could the non-conserved regions differ because different sequences are needed to implement the same function in different species?

• 5.7k views

ADD COMMENT • link updated 12.5 years ago by VS ▴ 740 • written 12.5 years ago by Justin ▴ 470

2

Homologous structures are similar, if they are similar at all, because of common descent, not because of functional similarity (that would be analogous). Sequence similarity is an indicator of common descent, not proof of it. Functional similarity is not proof of common descent either.

ADD REPLY • link 12.5 years ago by Ido Tamir 5.2k

0

Thanks, I had confused homologous with analogous.

ADD REPLY • link 12.5 years ago by Justin ▴ 470

score 9 · Answer 1 · 2012-10-10

For your first question, here is the link describing how Ensembl predicts homologs. In summary, the steps involved are:

1) Load the longest translation of each gene from both species
2) Run BLAST on the two sets against each other
3) Generate clusters based on the BLAST scores
4) Make multiple alignments of the sequences within each cluster
5) From each cluster, build a phylogenetic tree and infer orthology relationships
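
To make the clustering step in the list above a bit more concrete, here is a minimal sketch of grouping genes by BLAST score. It is not Ensembl's implementation: the gene names, the bit scores and the `MIN_SCORE` cutoff are all invented for illustration, and plain single-linkage clustering is used as a simple stand-in for whatever the real pipeline does.

```python
from collections import defaultdict

# Hypothetical all-vs-all BLAST bit scores between human (h*) and mouse (m*)
# genes. Names and scores are invented for illustration only.
blast_hits = [
    ("hG", "mH", 250.0),
    ("hG", "mH2", 180.0),
    ("hA", "mA", 400.0),
    ("hA", "mH", 20.0),  # weak hit, below the cutoff
]

MIN_SCORE = 50.0  # assumed cutoff, not an Ensembl parameter


def cluster_by_score(hits, min_score):
    """Single-linkage clustering: genes joined by a hit >= min_score share a cluster."""
    parent = {}

    def find(gene):
        # Union-find lookup; registers unseen genes as their own cluster.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for a, b, score in hits:
        root_a, root_b = find(a), find(b)
        if score >= min_score and root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters

    clusters = defaultdict(set)
    for gene in parent:
        clusters[find(gene)].add(gene)
    return sorted(sorted(members) for members in clusters.values())


clusters = cluster_by_score(blast_hits, MIN_SCORE)
print(clusters)
```

Each resulting cluster would then be the input to the alignment and tree-building steps. Note that clustering alone only says the genes are similar; the orthology calls come from the tree.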

Your second question is very broad, but as for the specific points: yes, it is possible to have a similar protein structure even when the sequence is not wholly conserved, and also to have different protein structures for similar sequences. I am not sure about the 2D level.

The roles of non-conserved regions in orthologs can be myriad. One species can have several orthologs of a single gene in another species (see the one2many, many2one and many2many relationships in the Ensembl link above). These extra copies arise by duplication, and over evolutionary time they diverge (giving rise to non-conserved regions). A copy may adopt a new function (neo-functionalization) or become non-functional (a pseudogene), or each copy may retain part of the original function or adopt tissue-specific expression (sub-functionalization).
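
As a rough illustration of the one2many, many2one and many2many relationships mentioned above, the sketch below labels each ortholog pair by counting how many partners each gene has in the other species. This is a simplification and an assumption on my part: Ensembl infers these relationships from gene trees, not from pair counts, and the gene names here are hypothetical. Labels are read from the human side, so one2many means one human gene with several mouse orthologs.

```python
from collections import defaultdict

# Hypothetical (human gene, mouse gene) ortholog pairs; names are invented.
pairs = [
    ("G", "H"),                                              # single copy in both
    ("A", "A1"), ("A", "A2"),                                # duplicated in mouse
    ("B1", "B"), ("B2", "B"),                                # duplicated in human
    ("C1", "D1"), ("C1", "D2"), ("C2", "D1"), ("C2", "D2"),  # duplicated in both
]


def classify(pairs):
    """Label each pair by the number of partners each gene has in the other species."""
    human_to_mouse = defaultdict(set)
    mouse_to_human = defaultdict(set)
    for human, mouse in pairs:
        human_to_mouse[human].add(mouse)
        mouse_to_human[mouse].add(human)

    labels = {}
    for human, mouse in pairs:
        many_mouse = len(human_to_mouse[human]) > 1  # human gene has several mouse partners
        many_human = len(mouse_to_human[mouse]) > 1  # mouse gene has several human partners
        if many_mouse and many_human:
            labels[(human, mouse)] = "many2many"
        elif many_mouse:
            labels[(human, mouse)] = "one2many"
        elif many_human:
            labels[(human, mouse)] = "many2one"
        else:
            labels[(human, mouse)] = "one2one"
    return labels


labels = classify(pairs)
for pair, label in labels.items():
    print(pair, label)
```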