3.4 years ago
jmungar2
I am using Belvu to calculate per-residue conservation scores, but I could not find, either in the manual or in the SeqTools paper, how conservation is calculated (Shannon entropy, similarity, etc.).
For example, Belvu sometimes gives me low conservation values (2-3) for columns with high percentage identity (over 80%), or a value of 4 for 100% sequence identity. Any idea where I can find this information?
Thank you
I don't know the exact answer, but from what you wrote, my guess is that the conservation is based on information content. You can read more about that by Googling "sequence logo". The maximum information content for proteins is log2(20) = 4.321928095, where 20 is the number of amino acids. For nucleotides it would be log2(4) = 2.

Thank you for your response. Actually, I also get values way over 4.3, up to 11, so it's a bit confusing...
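For reference, the information-content measure suggested in the answer can be sketched in a few lines of Python. This is the plain, uncorrected sequence-logo formula (maximum entropy minus observed Shannon entropy per column); it is not confirmed to be what Belvu computes. The function names, the gap characters, and the choice to ignore gaps are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_information_content(column, alphabet_size=20, gap_chars="-."):
    """Information content (bits) of one alignment column.

    Assumption: gaps are ignored and no small-sample correction is applied.
    """
    residues = [c.upper() for c in column if c not in gap_chars]
    if not residues:
        return 0.0  # column contains only gaps
    total = len(residues)
    counts = Counter(residues)
    # Shannon entropy of the observed residue frequencies
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Information content = maximum possible entropy - observed entropy
    return math.log2(alphabet_size) - entropy


def alignment_information_content(sequences, alphabet_size=20):
    """Per-column information content for equal-length aligned sequences."""
    return [
        column_information_content(col, alphabet_size)
        for col in zip(*sequences)
    ]


if __name__ == "__main__":
    # Hypothetical toy protein alignment, for illustration only
    toy_alignment = ["MKVL", "MKIL", "MRVL"]
    for position, score in enumerate(alignment_information_content(toy_alignment), start=1):
        print(position, round(score, 3))
```

Under this definition no protein column can score above log2(20), about 4.32 bits, so values of up to 11 would have to come from a different measure or a different scaling.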