I am looking for a protein structure prediction method that returns a value for a peptide's probability of forming a beta-sheet. I would like to give the program stretches of about 20 amino acids and have it return a number that I can use as a parameter in an evolutionary algorithm. That algorithm is already computationally expensive, so I would like a relatively fast way of calculating this value from short stretches (about 20 aa) rather than from the full protein. Python or Perl would be preferable.
What do you think of the reliability of a calculation like this? My algorithm currently uses the molecular weight (a rough predictor of 'bulkiness') and the isoelectric point of a short stretch, but reviewers have told us to incorporate some information about the protein's structure. So it doesn't necessarily have to be a beta-sheet prediction. Any ideas?
Thank you. That could be helpful. I know it is better to use a complete sequence, but the program will end up being very computationally expensive because I will be repeating it thousands of times. I would like to find the '3D structure value' of the wild-type protein and evolve random sequences towards it in silico. But thanks, I will look into those resources and post my answer if I find one.
I really would not worry about the computation. "Real" time for garnier on my system is milliseconds per sequence, so "thousands of times" will be minutes, or less. Secondary structure prediction is not intensive.
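If even an external call turns out to be more than you need inside the evolutionary loop, a per-residue propensity average is about as cheap as a structure-derived score gets. Here is a minimal Python sketch, assuming the Chou-Fasman beta-sheet propensities as they are commonly tabulated (please check the numbers against a primary source before relying on them). The name beta_sheet_score is made up for illustration. Note that this gives a crude propensity average, not a probability, and it is not what garnier computes.

```python
# Chou-Fasman beta-sheet propensities P(b), as commonly tabulated.
# ASSUMPTION: verify these values against a primary source before use.
P_BETA = {
    "A": 0.83, "R": 0.93, "N": 0.89, "D": 0.54, "C": 1.19,
    "Q": 1.10, "E": 0.37, "G": 0.75, "H": 0.87, "I": 1.60,
    "L": 1.30, "K": 0.74, "M": 1.05, "F": 1.38, "P": 0.55,
    "S": 0.75, "T": 1.19, "W": 1.37, "Y": 1.47, "V": 1.70,
}


def beta_sheet_score(peptide):
    """Return the mean beta-sheet propensity of a peptide (hypothetical helper)."""
    residues = peptide.upper()
    if not residues:
        raise ValueError("empty peptide")
    # Reject anything outside the 20 standard amino acids.
    unknown = set(residues) - set(P_BETA)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(P_BETA[aa] for aa in residues) / len(residues)


if __name__ == "__main__":
    # Two made-up 20 aa peptides: one rich in V/I/T/Y, one rich in E/P/D/K.
    for pep in ("VIVTYVIVTYVIVTYVIVTY", "EEPDKEEPDKEEPDKEEPDK"):
        print(pep, round(beta_sheet_score(pep), 3))
```

Values above 1 suggest the stretch favours beta-sheet on average and values below 1 suggest it disfavours it. Since this is a single dictionary lookup per residue, scoring thousands of 20 aa stretches should cost next to nothing.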
OK, that is good to know. I've downloaded the C source for DSC (Discrimination of protein Secondary structure Class). I think if I could get it to work from the command line, I could come up with a way to use it in my Perl program. It outputs the probability that each amino acid in your sequence is in a Helix, Coil, or Beta Sheet. Any idea how to run DSC on a Mac?
Here under software: http://www.aber.ac.uk/en/cs/research/cb/dss/
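Once it is compiled, one way to call a command-line predictor from a script is to write the peptide to a temporary FASTA file, run the program on it, and capture what it prints. Here is a rough sketch in Python (the same pattern works with backticks in Perl). The function name run_predictor and the "./predictor" command are placeholders. I am not assuming anything about DSC's real arguments, input format, or output layout, so check its documentation and adjust the command and add parsing to match.

```python
import os
import subprocess
import tempfile


def run_predictor(command, peptide, name="peptide"):
    """Write the peptide to a temporary FASTA file, run `command` with the
    file path appended as the last argument, and return the program's stdout.

    `command` is a list such as ["./predictor"] (a placeholder, not a real tool).
    """
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as handle:
        handle.write(f">{name}\n{peptide}\n")
        path = handle.name
    try:
        result = subprocess.run(
            command + [path],
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode output as str
            check=True,           # raise CalledProcessError on non-zero exit
        )
        return result.stdout
    finally:
        # Always remove the temporary file, even if the program failed.
        os.remove(path)
```

A call would look like run_predictor(["./predictor"], "ACDEFGHIKLMNPQRSTVWY"), and the returned text still has to be parsed according to whatever format the program actually writes.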
Got it running.