Why Is Statistical Coupling Analysis Not Confounded By Shared Evolutionary History?
14.2 years ago
Rohan ▴ 50

Statistical Coupling Analysis (SCA) is a method for finding pairs of amino acids that covary in a protein family, based on a multiple sequence alignment. See http://www.hhmi.swmed.edu/Labs/rr/sca.html and http://en.wikipedia.org/wiki/Statistical_coupling_analysis.

What I don't understand is how this method avoids simply recovering the phylogenetic tree. Why doesn't shared evolutionary history bias this procedure?

statistics protein evolution alignment • 5.2k views
14.2 years ago
Mrawlins ▴ 430

It seems to me that covariance is originally determined by the thermodynamics governing protein/peptide folding, and that those covariances are then propagated through evolutionary descent. The questions being asked of SCA and of evolutionary history therefore have some overlap.

SCA may pick up on features missed by evolutionary history. For example, imagine that a pair of mutations in a hemoglobin protein confers resistance to a mammalian parasite. That pair of mutations may have occurred in a common ancestor of all mammals, in which case SCA will produce the same result as the phylogenetic tree. But it may also have arisen later, independently in multiple species, and propagated from there. Phylogeny may not identify covariance in this sort of feature, while SCA might.
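To make the hemoglobin example concrete, here is a minimal sketch in Python. The clade labels, the residue pairs ("AL" as wild type, "VF" as the double mutant) and both helper functions are made up for illustration; this is not SCA itself, only a toy showing why the column counts alone cannot tell the two scenarios apart.

```python
from collections import Counter

# Hypothetical toy data: (clade label, residues at two alignment positions).
# "AL" stands for the wild-type pair, "VF" for the double mutant.
single_origin = [("c1", "VF")] * 3 + [("c2", "AL")] * 3 + [("c3", "AL")] * 3
multiple_origins = [
    ("c1", "VF"), ("c1", "AL"), ("c1", "AL"),
    ("c2", "VF"), ("c2", "AL"), ("c2", "AL"),
    ("c3", "VF"), ("c3", "AL"), ("c3", "AL"),
]


def pair_counts(rows):
    """Joint residue counts for the two columns, ignoring clade labels."""
    return Counter(residues for _, residues in rows)


def clades_with_pair(rows, pair):
    """Clades in which the given residue pair is observed at least once."""
    return {clade for clade, residues in rows if residues == pair}


for name, rows in (("single origin", single_origin),
                   ("multiple origins", multiple_origins)):
    print(name, dict(pair_counts(rows)), sorted(clades_with_pair(rows, "VF")))
```

Both toy alignments give identical joint counts for the two columns, so any statistic computed from the columns alone scores them the same. Only the clade labels show that the pair appears in one clade in the first case and in three clades in the second, which is a crude stand-in for "arose once" versus "arose several times".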

Shared evolutionary history will probably bias the answers of SCA, but that bias doesn't necessarily invalidate the results. (Sometimes bias makes it easier to get the right answer. Bias isn't guaranteed to be bad.) If a multiple-feature covariance has propagated through many generations of many species, is it no longer interesting?

14.1 years ago
Bilouweb ★ 1.1k

In the following article, the authors try to derive residue contacts from correlated columns in multiple alignments. They use mutual information (plus a direct coupling measure), not statistical coupling, but it may still help you find an answer.
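For readers who have not seen it, plain mutual information between alignment columns can be sketched as below. This is only the textbook MI formula applied to a made-up four-column alignment; it is neither the direct coupling measure from the article nor SCA, and the function names and sequences are my own assumptions.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    return sum(
        (c / n) * log2((c / n) / ((count_i[a] / n) * (count_j[b] / n)))
        for (a, b), c in count_ij.items()
    )


def rank_column_pairs(alignment):
    """Score every pair of columns and sort from highest to lowest MI."""
    columns = list(zip(*alignment))
    scores = {
        (i, j): mutual_information(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment: columns 1 and 2 (0-based) always change together.
toy_alignment = ["AKLE", "AKLD", "ARME", "ARMD", "GKLE", "GRMD"]

for (i, j), score in rank_column_pairs(toy_alignment):
    print(f"columns {i},{j}: {score:.3f} bits")
```

Note that this score treats every sequence as an independent observation, which is exactly the assumption that shared evolutionary history violates.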

Martin Weigt, Robert A. White, Hendrik Szurmant, James A. Hoch, and Terence Hwa. Identification of direct residue contacts in protein–protein interaction by message passing.

I also found some interesting articles about the use of correlations in multiple alignment columns:

Andreas Kowarsch, Angelika Fuchs, Dmitrij Frishman, Philipp Pagel. Correlated Mutations: A Hallmark of Phenotypic Amino Acid Substitutions.

Fodor AA, Aldrich RW. Influence of conservation on calculations of amino acid covariance in multiple sequence alignments.
