Dear all,
I have a question regarding conservation scores. In a well-known paper from 2007, Capra and Singh (https://doi.org/10.1093/bioinformatics/btm270) proposed the Jensen-Shannon divergence as a score for sequence conservation. They also compared it with other conservation scores, including ones based on substitution matrices and property-based scores.
Now my question is: does it make sense to additionally weight the Jensen-Shannon divergence by a substitution matrix or by a property-based grouping of amino acids? I know that there is a score that does this with the relative entropy (Kullback-Leibler divergence; Williamson, 1995).
If not, what would speak against it?
Best,
Jonathan
Thanks Mensur!
I would like to find out how conserved calcium-binding motifs are, so I thought it would be great to have a score that incorporates properties such as acidity. From the literature it seems that entropy-based scores perform very well, and they are easy to implement, e.g. in a Python script.
But my question was aimed more at whether it makes sense to additionally weight a relative-entropy-based score by a substitution matrix. I was wondering why no one seems to have done this. The only thing I have read about so far is property grouping for entropy-based scores, so I was wondering whether there is a statistical reason not to do this.
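To make the property-grouping idea concrete, here is a minimal sketch of what I have in mind: the Jensen-Shannon divergence of one alignment column against a background, computed either on the 20 residues or on property groups. The grouping in PROPERTY_GROUPS, the uniform background, the pseudocount, and the function names are all my own assumptions for illustration. Capra and Singh use a different background and additional weighting, and the substitution-matrix variant I am asking about is not covered here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Illustrative grouping only (an assumption, not taken from any paper).
PROPERTY_GROUPS = {
    "acidic": "DE",
    "basic": "KRH",
    "polar": "STNQ",
    "aliphatic": "AVLIM",
    "aromatic": "FWY",
    "special": "CGP",
}


def distribution(column, alphabet, mapping=None, pseudocount=1e-6):
    """Symbol frequencies in one column; gaps and unknown characters are skipped."""
    symbols = [mapping[c] if mapping else c for c in column if c in AMINO_ACIDS]
    counts = Counter(symbols)
    total = len(symbols) + pseudocount * len(alphabet)
    return {a: (counts[a] + pseudocount) / total for a in alphabet}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; lies between 0 and 1."""
    def kl(a, b):
        return sum(a[k] * math.log2(a[k] / b[k]) for k in a if a[k] > 0)

    m = {k: 0.5 * (p[k] + q[k]) for k in p}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def column_score(column, grouped=False):
    """Score one column against a uniform residue background (an assumption)."""
    if grouped:
        mapping = {aa: g for g, members in PROPERTY_GROUPS.items() for aa in members}
        alphabet = list(PROPERTY_GROUPS)
        # Group background follows from the uniform residue background.
        background = {g: len(m) / len(AMINO_ACIDS) for g, m in PROPERTY_GROUPS.items()}
    else:
        mapping = None
        alphabet = list(AMINO_ACIDS)
        background = {a: 1 / len(AMINO_ACIDS) for a in alphabet}
    return js_divergence(distribution(column, alphabet, mapping), background)


if __name__ == "__main__":
    for col in ("DDDDDD", "DEDEDE", "DKSAFC"):
        print(col, column_score(col), column_score(col, grouped=True))
```

Note that this score measures how far a column is from the background, not how similar its residues are to each other, so grouping does not automatically raise the score of a column that mixes D and E.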