Hi
I am reading the paper "Fold Recognition by Predicted Alignment Accuracy". In this paper, the author first aligns the input sequence with the sequence of a template protein.
Then, to define a structural similarity between the sequence and the template, the secondary structure of the sequence is first predicted with a tool such as PSIPRED. PSIPRED assigns three values to each residue I of the input sequence: Alpha_Helix(I), Beta_Sheet(I), and Loop(I). We can treat these values as confidence levels for residue I being in an alpha-helix, beta-sheet, or loop region (assume they sum to one).
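To make the "sum equal to one" assumption concrete, here is a minimal Python sketch of how I picture the three confidences for one residue. The function name, the dictionary keys, and the raw input numbers are my own illustration and are not PSIPRED's actual output format.

```python
def normalize_confidences(helix, sheet, loop):
    """Rescale three non-negative scores so that they sum to one.

    The argument names and the returned keys are hypothetical; they only
    mirror Alpha_Helix(I), Beta_Sheet(I) and Loop(I) from the text.
    """
    total = helix + sheet + loop
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {
        "helix": helix / total,
        "sheet": sheet / total,
        "loop": loop / total,
    }


# Made-up raw scores for a single residue I.
conf_i = normalize_confidences(2.0, 1.0, 1.0)
print(conf_i)
```

After this step each residue carries three numbers between 0 and 1 that can be read as confidences.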
The contribution of the paper is that, for an aligned pair of residues, for example residue J from the sequence and residue K from the template, the algorithm defines a structural similarity measure for the pair as shown below.
(Recall that we already computed J's confidence for each region with PSIPRED, and we know K's region because K is a residue of the template protein, whose structure is fully known.)
if K is in an alpha-helix region of the template
similarity(J,K) = Alpha_Helix(J) - Loop(J);
if K is in a beta-sheet region of the template
similarity(J,K) = Beta_Sheet(J) - Loop(J);
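Here is the rule above as a minimal Python sketch, as I understand it from the paper. The function name, the dictionary keys, and the example confidences are my own assumptions. I have only described the helix and sheet cases, so the sketch raises an error when K is in any other region instead of guessing.

```python
def similarity(conf_j, template_state_k):
    """Structural similarity of sequence residue J and template residue K.

    conf_j: dict with keys "helix", "sheet", "loop" (predicted confidences
            for J, assumed to sum to one).
    template_state_k: "helix" or "sheet", the known region of K.
    """
    if template_state_k == "helix":
        return conf_j["helix"] - conf_j["loop"]
    if template_state_k == "sheet":
        return conf_j["sheet"] - conf_j["loop"]
    # The loop case is not covered by the description above.
    raise ValueError(f"no rule described for state {template_state_k!r}")


# Made-up confidences for residue J.
conf_j = {"helix": 0.7, "sheet": 0.1, "loop": 0.2}
print(similarity(conf_j, "helix"))  # approximately 0.5
print(similarity(conf_j, "sheet"))  # approximately -0.1
```

Because the confidences sum to one, the score can be negative: it ranges from -1 (J is certainly a loop) to 1 (J is certainly in the same region as K).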
What I can't understand is why we subtract Loop(J) from Alpha_Helix(J) or Beta_Sheet(J). I think there should be a biological rationale for this, but I don't know what it is. What I would have expected is:
if K is in an alpha-helix region of the template
similarity(J,K) = Alpha_Helix(J);
if K is in a beta-sheet region of the template
similarity(J,K) = Beta_Sheet(J);
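To show where the two versions actually differ, here is a small Python comparison on two made-up residues. All names and numbers are my own illustration. It only shows the numerical difference; I am not claiming this is the reason the author chose the subtraction.

```python
def similarity_with_loop_penalty(conf, template_state):
    """The paper's version as I read it: confidence minus loop confidence."""
    if template_state not in ("helix", "sheet"):
        raise ValueError("only helix and sheet are described")
    return conf[template_state] - conf["loop"]


def similarity_plain(conf, template_state):
    """The version I expected: just the matching confidence."""
    if template_state not in ("helix", "sheet"):
        raise ValueError("only helix and sheet are described")
    return conf[template_state]


# Two hypothetical residues with the same helix confidence; the rest of the
# probability mass is on sheet for residue a and on loop for residue b.
residue_a = {"helix": 0.5, "sheet": 0.5, "loop": 0.0}
residue_b = {"helix": 0.5, "sheet": 0.0, "loop": 0.5}

for name, conf in (("a", residue_a), ("b", residue_b)):
    print(name,
          similarity_plain(conf, "helix"),
          similarity_with_loop_penalty(conf, "helix"))
# prints:
# a 0.5 0.5
# b 0.5 0.0
```

So against a helix residue K, my version scores both residues the same, while the paper's version scores the residue whose remaining confidence is in the loop state lower.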