Hi,
I've been trying to use the BioPython Bio.SeqUtils.ProtParam module to perform bulk calculations of the instability index of proteins. The only problem with this approach is that the BioPython implementation doesn't allow the sequence to contain the amino acid codes B, J, O, U, X and Z. The ExPASy ProtParam tool, however, accepts any amino acid code in the submitted sequence.
Does anyone know how the ExPASy tool handles the amino acid codes for which there is no stability data? I've tried removing the unsupported codes from the sequence and calculating the instability index on the modified sequence. Unsurprisingly, this didn't reproduce the ExPASy values: the instability calculation is based on dipeptides, so removing a residue joins its two neighbours into a dipeptide that wasn't in the original sequence, and it also changes the sequence length.
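To make the problem concrete, here is a minimal sketch of the calculation as I understand it: the index is 10/L times the sum of the weights of all overlapping dipeptides, where L is the sequence length. This is not the BioPython or ExPASy code. TOY_WEIGHTS holds made-up values standing in for the published dipeptide table, and the skip_unknown option is only one guess at how unsupported codes might be handled; I have no confirmation that ExPASy does this.

```python
# Codes with no entry in the dipeptide stability table.
UNSUPPORTED = set("BJOUXZ")

# Made-up weights for illustration only; NOT the published values.
TOY_WEIGHTS = {"GA": 2.0, "AG": 1.0}


def dipeptides(seq):
    """Return the overlapping dipeptides of seq, in order."""
    return [seq[i:i + 2] for i in range(len(seq) - 1)]


def instability_index(seq, weights, skip_unknown=False):
    """Compute (10 / L) * sum of dipeptide weights for seq.

    If skip_unknown is True, any dipeptide containing an unsupported
    code contributes nothing, but the residue still counts towards L.
    That handling is an assumption, not known ExPASy behaviour.
    """
    total = 0.0
    for pair in dipeptides(seq):
        if skip_unknown and set(pair) & UNSUPPORTED:
            continue
        total += weights[pair]
    return 10.0 / len(seq) * total


original = "GAXGA"
stripped = "".join(c for c in original if c not in UNSUPPORTED)

print(dipeptides(original))  # includes AX and XG
print(dipeptides(stripped))  # AX and XG are gone, but AG has appeared
print(instability_index(original, TOY_WEIGHTS, skip_unknown=True))
print(instability_index(stripped, TOY_WEIGHTS))
```

With these toy weights the two approaches give different values, because stripping X both introduces the AG dipeptide and shortens L from 5 to 4.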
The ExPASy documentation is here, and the stability data comes from this paper.
Simon
Sorry, I forgot to say that I had attempted that as well. It still seems to give different answers. I do agree with you about the usefulness of the instability predictions though. At this stage it's probably more a sense of curiosity/completeness that's leading me to try and work this out.