I am interested in getting the FT features for a set of kinases. For instance, for AKT1 I would get "FT DOMAIN 150 408 Protein kinase." from the file available at http://www.uniprot.org/uniprot/P31749.txt.
So I was wondering whether parsing the dedicated UniProt text annotation file is the only way to get this information, or whether it is also stored and available in a publicly accessible database.
There are quite a few ways to get this information out of the UniProt website.
Please write to help@uniprot.org
But, for example, you can get this via the REST interface. I am out of the office and won't have time to write a complete answer until next week (12th of December 2011).
Save as yourName.py. Usage: python yourName.py accessionIDsList
This script goes through each accession ID in the list, requests the entry, and displays the feature, the count of domains, and the domain name in a tab-delimited format. If you want to display other information, check out the REST service FAQ and add your own columns to the URL.
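A minimal sketch of a script along these lines (not the poster's original) is given here. It assumes the per-entry URL pattern quoted in the question still resolves, that accessionIDsList is a text file with one accession per line, and that FT lines follow the layout quoted in the question; current UniProt releases may lay out FT lines differently, in which case the pattern would need adjusting. The names extract_domains, fetch_entry and report are made up for illustration, and the script fetches the flat-file entry rather than a column-based REST query.

```python
import re
import sys
import urllib.request

# Assumption: URL pattern taken from the question; the service may redirect or have moved.
ENTRY_URL = "http://www.uniprot.org/uniprot/{acc}.txt"

# Matches lines such as "FT DOMAIN 150 408 Protein kinase." (layout as quoted in the question).
# Continuation lines of long descriptions are ignored in this sketch.
DOMAIN_LINE = re.compile(r"^FT\s+DOMAIN\s+(\S+)\s+(\S+)\s*(.*)$")


def extract_domains(flat_text):
    """Return a (start, end, description) tuple for each FT DOMAIN line."""
    domains = []
    for line in flat_text.splitlines():
        match = DOMAIN_LINE.match(line)
        if match:
            start, end, description = match.groups()
            # Positions stay as strings because they can be uncertain (e.g. "<1" or "?").
            domains.append((start, end, description.strip().rstrip(".")))
    return domains


def fetch_entry(accession):
    """Download the flat-file text of one entry."""
    with urllib.request.urlopen(ENTRY_URL.format(acc=accession)) as response:
        return response.read().decode("utf-8")


def report(accession_file, fetch=fetch_entry, out=sys.stdout):
    """Write one tab-delimited line per accession: ID, domain count, domain names."""
    with open(accession_file) as handle:
        accessions = [line.strip() for line in handle if line.strip()]
    for accession in accessions:
        domains = extract_domains(fetch(accession))
        names = "; ".join(f"{d[2]} ({d[0]}-{d[1]})" for d in domains)
        out.write(f"{accession}\t{len(domains)}\t{names}\n")


if __name__ == "__main__" and len(sys.argv) == 2:
    report(sys.argv[1])
```

The fetch argument of report is there so the parsing can be tried on saved entries without hitting the server.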
You can download the flat file containing all Swiss-Prot proteins here [1]. To parse that file, I'd use something like Biopython, which makes it easy to retrieve the feature section of each protein.
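If installing Biopython is not an option, the same idea can be sketched with the standard library alone. This is not Biopython's API; iter_entry_domains is a hypothetical helper, and it assumes entries are terminated by a "//" line, that the primary accession is the first one on the first AC line, and that FT lines use the layout quoted in the question (newer releases may differ).

```python
import re

# Matches lines such as "FT DOMAIN 150 408 Protein kinase." (layout as quoted in the question).
DOMAIN_LINE = re.compile(r"^FT\s+DOMAIN\s+(\S+)\s+(\S+)\s*(.*)$")


def iter_entry_domains(lines):
    """Yield (primary_accession, [(start, end, description), ...]) for each entry."""
    accession, domains = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("AC ") and accession is None:
            # Assumption: the first accession on the first AC line is the primary one.
            accession = line[2:].split(";")[0].strip()
        elif line.startswith("//"):
            # "//" closes an entry; emit what was collected and reset.
            if accession is not None:
                yield accession, domains
            accession, domains = None, []
        else:
            match = DOMAIN_LINE.match(line)
            if match:
                start, end, description = match.groups()
                domains.append((start, end, description.strip().rstrip(".")))
```

Because it consumes any iterable of lines, an open file handle on the downloaded flat file can be passed in directly, so the whole file never has to be held in memory.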
Very, very nice! I can't wait for December 12th. I didn't know about this REST possibility.
My colleague @Elisabeth_Gasteiger hopefully answered your question. @Pierre_Lindenbaum also gave a good answer.