I want to compare two MSAs of the same set of protein sequences and determine which is better (I do not have the 'true alignment'). One way is the sum-of-pairs method. But I think sum of pairs is defined for a single column of an alignment (the average of the scores of all pairs of residues in that column). So how do we calculate the score for the whole alignment? Do we again average the average scores obtained for each column?
Consider:
A-K
VVA
CVK
Suppose this is the alignment and I am using BLOSUM62 as the scoring matrix.
The SOP for column 1 will be sc(A,V) + sc(V,C) + sc(A,C) = a1 (say).
Similarly, the SOP for column 2 will be sc(-,V) + sc(V,V) + sc(-,V) = a2,
and the SOP for column 3 will be sc(K,A) + sc(A,K) + sc(K,K) = a3.
Now, for the score of the MSA, do I take the average of a1, a2, a3?
And then use this score as a metric to compare between two MSAs?
Please let me know of any papers or resources that talk about this. Thanks
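For illustration, here is a minimal Python sketch of the column-wise calculation described in the question, applied to the three-row example. It is a sketch under stated assumptions, not a definitive implementation: the names `BLOSUM62_SUBSET`, `GAP_PENALTY`, `GAP_GAP`, `pair_score`, `column_scores` and `sp_score` are hypothetical, only the BLOSUM62 entries needed for A, V, C and K are hard-coded, and the residue-gap score of -4 (with gap-gap pairs scored 0) is an arbitrary choice, since BLOSUM62 itself does not define a gap score. The sketch sums the column scores, which is how the sum-of-pairs score of a whole alignment is conventionally defined, rather than averaging them.

```python
from itertools import combinations

# Subset of BLOSUM62 covering only the residues in the example (A, V, C, K).
# Keys are alphabetically sorted residue pairs.
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("A", "C"): 0, ("A", "K"): -1, ("A", "V"): 0,
    ("C", "C"): 9, ("C", "K"): -3, ("C", "V"): -1,
    ("K", "K"): 5, ("K", "V"): -2,
    ("V", "V"): 4,
}

GAP_PENALTY = -4  # assumption: score for a residue aligned to a gap
GAP_GAP = 0       # assumption: score for a gap aligned to a gap


def pair_score(x, y):
    """Score one pair of aligned symbols (residues or '-')."""
    if x == "-" and y == "-":
        return GAP_GAP
    if x == "-" or y == "-":
        return GAP_PENALTY
    return BLOSUM62_SUBSET[tuple(sorted((x, y)))]


def column_scores(alignment):
    """Return the sum-of-pairs score of each column (a1, a2, a3, ...)."""
    return [
        sum(pair_score(x, y) for x, y in combinations(column, 2))
        for column in zip(*alignment)
    ]


def sp_score(alignment):
    """Sum the column scores to get one score for the whole alignment."""
    return sum(column_scores(alignment))


alignment = ["A-K", "VVA", "CVK"]
print(column_scores(alignment))  # [-1, -4, 3]
print(sp_score(alignment))       # -2
```

With these assumptions, a1 = -1, a2 = -4 and a3 = 3. Two alignments of the same sequences can have different numbers of columns, so dividing the total by the column count would give a different number from the plain sum. The result also depends heavily on the gap scores chosen.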
When you say you want to determine which alignment is better, I am not sure whether you already know a priori that one is better by some criterion (not explained in your post).
But this question reminds me of qscore from Robert Edgar (author of the MUSCLE software): http://www.drive5.com/qscore/. It can compare two alignments. I used it a while back, and it is UNIX compatible. I don't think there are Mac or Windows versions, if that matters to you at all. I am not familiar with the alistat or MstatX suggestions posted by trausch.
Hi Anand. I do not have a reference alignment (I have updated the question), hence I cannot use Edgar's qscore.
Thanks for the link. Is there any documentation or a set of steps for running the qscore code? I am pretty much lost. Could you please point me in the right direction?