Hi all,
I hope this question makes sense. I have a multiple sequence alignment and I'd like to propose positions that could have a functional relevance. Some positions are fully, or almost fully, conserved and I can quantify the conservation by means of Shannon entropies.
However, I've found out that some people measure the evolutionary rates of positions instead, for example with Rate4Site. Here, slowly evolving positions are thought to be relevant. To what extent is this notion different from residue conservation? If I want to find residues that may have functional or structural relevance, what would you guys measure?
Thanks!
Ultimately, both methods are trying to give you an idea of functional relevance from a primary sequence alignment. You are using information content to quantify conservation, while Rate4Site first builds a tree and then uses that information to calculate conservation.
How exactly are you going to use Shannon entropies to ascertain functional relevance? Is anything with low entropy simply considered conserved? What is your threshold, and how did you come up with it?
While in theory methods that take the evolutionary tree into account should outperform information-theoretic approaches, Rate4Site has not been shown to outperform them significantly. For Shannon entropy (and its variants) as a conservation score, there are numerous publications showing decent performance (see my post below).
Rate4Site is OK, but it only calculates the site rate, and with a rather limited model of evolution at that. I would also point out that the Capra method, while primarily an information-theoretic measure, does, I believe, attempt to weight its conservation score based on the distance between sequences. So it is attempting to weight based on the amount of diversity within and between the samples.
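To make the idea of weighting a conservation score by sequence diversity concrete, here is a minimal Python sketch. It is not a reimplementation of the Capra method or of Rate4Site. It assumes Henikoff-style position-based sequence weights as the weighting scheme, and the function names and the toy alignment are hypothetical. Gaps count as an ordinary symbol when computing weights and are ignored when computing entropy, which is a simplification.

```python
import math
from collections import Counter, defaultdict

def position_based_weights(alignment):
    """Henikoff-style position-based weights, normalized to sum to 1.

    In each column, a sequence receives 1 / (k * n), where k is the number
    of distinct symbols in the column and n is how many sequences share
    this sequence's symbol. Assumption: gaps are treated as a symbol here.
    """
    weights = [0.0] * len(alignment)
    for column in zip(*alignment):  # transpose rows into columns
        counts = Counter(column)
        k = len(counts)
        for i, residue in enumerate(column):
            weights[i] += 1.0 / (k * counts[residue])
    total = sum(weights)
    return [w / total for w in weights]

def weighted_entropy(column, weights, gap="-"):
    """Shannon entropy (bits) of a column using weighted residue frequencies."""
    freq = defaultdict(float)
    for residue, w in zip(column, weights):
        if residue != gap:
            freq[residue] += w
    total = sum(freq.values())
    if total == 0:
        return float("inf")  # all-gap column: treat as uninformative
    return sum(-(f / total) * math.log2(f / total) for f in freq.values())

# Hypothetical toy alignment: three identical sequences and one divergent one.
alignment = ["MKVLA", "MKVLA", "MKVLA", "MRVLG"]
weights = position_based_weights(alignment)
columns = list(zip(*alignment))
scores = [weighted_entropy(col, weights) for col in columns]
```

With this toy input the divergent sequence gets a larger weight than each of the three redundant ones, so the variable columns score as less conserved than they would with plain, unweighted counts.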
I would also take all of the performance measures with a grain of salt. In my experience the selection of datasets to test on is highly biased and often downright uninformative. While I was working on predicting functional divergence and shifts in functional importance, I explored this issue a little bit late last year:
http://www.ncbi.nlm.nih.gov/pubmed/21840876
Admittedly, my naive approach was to rank positions by entropy and manually explore the top-ranking ones (the most conserved) for a functional or structural rationale.
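For reference, that ranking can be sketched in a few lines of Python. This is only an illustration under assumptions: the alignment is a list of equal-length strings, gaps are written as "-" and ignored, and the function names and the toy sequences are made up for the example.

```python
import math
from collections import Counter

def column_entropy(column, gap="-"):
    """Shannon entropy (bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != gap]
    if not residues:
        return float("inf")  # all-gap column: rank it last
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

def rank_by_entropy(alignment):
    """Return (1-based position, entropy) pairs, most conserved first."""
    columns = zip(*alignment)  # transpose rows into columns
    scores = [(i + 1, column_entropy(col)) for i, col in enumerate(columns)]
    return sorted(scores, key=lambda item: item[1])

# Hypothetical toy alignment, not real data.
alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLS"]
ranking = rank_by_entropy(alignment)
for position, entropy in ranking:
    print(position, round(entropy, 3))
```

Positions with entropy 0 are fully conserved in the alignment. Where to draw the line above that is exactly the threshold question raised earlier in the thread, so the sketch only ranks and does not pick a cutoff.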