I came across an interesting problem yesterday, and, while I solved it in my own way, I thought it would make an interesting challenge for you good folks.
In a situation where you have only the name for a protein in UniProt (rather than the accession), it is perfectly possible to visit a URL such as: http://www.uniprot.org/uniprot/OPSD_HUMAN and the UniProt website will redirect you to http://www.uniprot.org/uniprot/P08100, which is, obviously, the UniProt entry for the protein concerned.
If you have a list of such names and you want to get FASTA sequences for them, to me the obvious thing to do would be to go to http://www.uniprot.org/uniprot/OPSD_HUMAN.fasta, which, in light of the previous behaviour, I would expect to redirect to http://www.uniprot.org/uniprot/P08100.fasta. Unfortunately this is not the case: the .fasta URL also redirects to the UniProt entry page for the protein, not to the sequence.
I would imagine it is a one-line fix to "correct" this behaviour, and I was wondering if the Biostar-ers could come up with a solution of a similar length (rather than my crufty Python solution).
So, to clarify - a way to retrieve FASTA sequences for a list of UniProt protein names. An example list can be downloaded from here.
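To make the task concrete, here is a minimal Python sketch of the redirect-based approach described above: visit the name URL, read the accession off the URL you end up at, then request the `.fasta` URL for that accession. It assumes the site behaves exactly as described in this post (which may have changed since), and the function names and the `opener` parameter are made up for illustration; this is not the "crufty" script mentioned above.

```python
import urllib.request

# URL pattern as quoted in the question; assumed to still behave as described.
BASE = "http://www.uniprot.org/uniprot/"


def accession_from_url(url):
    """Return the last path component of a URL, e.g. '.../P08100' -> 'P08100'."""
    return url.rstrip("/").rsplit("/", 1)[-1]


def resolve_accession(name, opener=urllib.request.urlopen):
    """Open the entry-name URL and read the accession from the final URL.

    urlopen follows redirects by default, so geturl() reports where we landed.
    `opener` is a hypothetical hook so the network call can be swapped out.
    """
    with opener(BASE + name) as response:
        return accession_from_url(response.geturl())


def fetch_fasta(name, opener=urllib.request.urlopen):
    """Resolve an entry name to its accession, then fetch that accession's FASTA."""
    accession = resolve_accession(name, opener)
    with opener(BASE + accession + ".fasta") as response:
        return response.read().decode("utf-8")


def fetch_all(names, opener=urllib.request.urlopen):
    """Concatenate the FASTA records for a list of entry names."""
    return "".join(fetch_fasta(name, opener) for name in names)
```

Usage would be something like `print(fetch_all(["OPSD_HUMAN"]))`, with the list of names read from whatever file you have them in. It costs two requests per name, which is exactly the clumsiness a shorter answer should avoid.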
Some rules - there are no rules - any language, any library, any tool. I'm very interested in how broad a range of solutions we can produce. Obviously there is no correct answer, so the highest scoring response by the end of the week (i.e. 5pm Friday UK time, or as close to) gets the accepted answer bonus.
Your "using an already available webform" solution of course wins against my script ;-)
Well, I can get mine down to 95 characters: Go to uniprot.org, click Retrieve tab. Paste list in box, click Retrieve button. Choose output. And, may I add, mine outputs as web interface, FASTA, GFF, flat text, XML, RDF/XML, or a list of the corresponding IDs.
Heh. Yeah, but that wasn't ruled out! There were no rules--it said any tool! You didn't have to write the platform you are using, right? Kidding. I'm just trying to be the voice of a non-coder here. If this site is to be broadly useful to people, scripts may not always be the only answer.
Scripts are the only answer. But in your case, one of the uniprot devs did the writing ;-)
Yep, it's always code or a script that does the work, whether already done in a web-based tool or newly created. But there's no use in reinventing the wheel (unless it's a better wheel, of course) :D.
ROFL. Yeah, no one would ever want to reuse anyone else's existing code...wait a minute.... (PS: I can't believe that my answer has a positive value score. I expected to get downrated for this answer!)