Hi, I've been using the ProtParam tool from Expasy to look at predictions of protein half life. Is there a way to download this tool for bulk calculations? Or is there any other tool out there that calculates this property?
I have used a Biopython-based script for this kind of bulk calculation before. ProtParam is a sub-module of Bio.SeqUtils, and a sample script is available in the Biopython documentation for the module.
ExPASy have done an excellent job of keeping their tools hidden behind a very old-fashioned web interface for many years now. It's starting to look old and tired.
I would treat the half-life prediction with some caution. It assumes that the N-terminus is not modified. It's also one of the earlier tools to be developed for sequence analysis, based on a small amount of rather old biochemical data: I don't know when it was last updated. A nice review containing biochemical data on the relation of the N-terminus to protein stability is: Processed N-termini of mature proteins in higher eukaryotes and their major contribution to dynamic proteomics.
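To make the reason for caution concrete: the ProtParam estimate follows the N-end rule, so it depends only on the identity of the first residue. Here is a minimal sketch of that idea as a local lookup. The names `N_END_HALF_LIFE` and `estimate_half_life` are made up for illustration, and only the Met row is filled in (with the values from the ProtParam output quoted further down in this thread). The other residues are deliberately left out rather than guessed, so you would need to copy them from the ProtParam documentation yourself.

```ruby
# Illustrative N-end rule lookup keyed on the N-terminal residue.
# ASSUMPTION: only Met is listed, using the values from the ProtParam output
# quoted in this thread. All other rows must come from the ProtParam docs.
N_END_HALF_LIFE = {
  "M" => { mammalian: "30 hours", yeast: ">20 hours", e_coli: ">10 hours" }
}.freeze

# Hypothetical helper: returns the table row for the first residue, or nil
# when the residue is not in the table (or the sequence is empty).
def estimate_half_life(sequence, table = N_END_HALF_LIFE)
  n_term = sequence.strip.upcase[0]
  table[n_term]
end

# Bulk use: map sequence IDs to estimates.
sequences = { "example_1" => "MEEPQSDLSIELPLSQETFSDLWKLLPPNNV" }
sequences.each do |id, seq|
  puts "#{id}: #{estimate_half_life(seq).inspect}"
end
```

Note that nothing past the first residue is ever inspected, which is why a modified or processed N-terminus makes the estimate unreliable.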
Apart from the Biopython tool that Khader mentioned, I'm not aware of other standalone tools. You could resort to using a mechanize library to automate submission at the web site. Here's an example using Ruby's mechanize:
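#!/usr/bin/ruby
require "rubygems"   # only needed on Ruby 1.8; later versions load it automatically
require "mechanize"
# Example protein sequence (one-letter amino acid codes)
seq = "MEEPQSDLSIELPLSQETFSDLWKLLPPNNVLSTLPSSDSIEELFLSENVTGWLEDSGGALQGVAAAAAS"
agent = Mechanize.new
# Fetch the ProtParam submission page (URL as it was when this was written)
page = agent.get('http://au.expasy.org/tools/protparam.html')
# Locate the submission form by its action attribute
form = page.form_with(:action => '/cgi-bin/protparam')
# Fill in the sequence field and submit the form
form.sequence = seq
result = form.submit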
Unfortunately, the returned HTML is ugly and difficult to parse (lacking div, class, etc.). If you split the body text on <B>, array element 11 contains the half-life data (but don't rely on that, obviously!):
text = result.body.split("<B>")
puts text[11]
Estimated half-life:</B>
The N-terminal of the sequence considered is M (Met).
The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
>20 hours (yeast, in vivo).
>10 hours (Escherichia coli, in vivo).
And then you'd want to clean that up with some regex and parsing.
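As a rough sketch of that clean-up step, something like the following could work. `parse_half_life` is a hypothetical helper name (not part of mechanize), and the regexes assume the fragment keeps exactly the wording shown above, so treat it as a starting point rather than a robust parser.

```ruby
# Sample fragment copied from the output above; assumed to be representative
# of what text[11] contains.
SAMPLE = <<~TEXT
  Estimated half-life:</B>
  The N-terminal of the sequence considered is M (Met).
  The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).
  >20 hours (yeast, in vivo).
  >10 hours (Escherichia coli, in vivo).
TEXT

# Hypothetical helper: extracts the N-terminal residue and each
# (value, unit, system) triple from the half-life fragment.
def parse_half_life(fragment)
  # Strip leftover tags such as </B>, then decode &gt; in case it is escaped
  clean = fragment.gsub(/<[^>]+>/, "").gsub("&gt;", ">")
  # One-letter code of the N-terminal residue, or nil if the wording changed
  n_term = clean[/N-terminal of the sequence considered is (\w)/, 1]
  # Each estimate looks like "30 hours (system)" or ">20 hours (system)"
  pattern = /(>?[\d.]+)\s*(hours?|min\w*|sec\w*)\s*\(([^)]+)\)/
  half_lives = clean.scan(pattern).map do |value, unit, system|
    { value: value, unit: unit, system: system }
  end
  { n_terminal: n_term, half_lives: half_lives }
end

p parse_half_life(SAMPLE)
```

With the mechanize script above you would call `parse_half_life(text[11])`, although locating the fragment by searching the body for "Estimated half-life" would be less fragile than relying on a fixed array index.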
At the Center for Biological Sequence Analysis in Denmark, we managed to get hold of a copy of the ExPASy ProtParam script developed at the Swiss Institute of Bioinformatics. Our contact person was Elisabeth Gasteiger; you might want to contact her if you want to be sure that your bulk calculation of parameters is done exactly the same way as in the web interface.