Hello everyone,
I want to retrieve UniProt identifiers from an Entrez Gene ID. I'm trying to do it programmatically with the following script:
import urllib,urllib2
url = 'http://www.uniprot.org/mapping/'
params = {
'from':'P_ENTREZGENEID',
'to':'ACC',
'format':'tab',
'query':'88',
'fil':'reviewed3%Ayes',}
data = urllib.urlencode(params)
request = urllib2.Request(url, data)
response = urllib2.urlopen(request)
page = response.read(200000)
The problem is that running it with or without the filter (reviewed and organism) makes no difference, and it should.
The output for this query (88) with the same filters on the UniProt ID mapping service is just one identifier: P35609.
The script, on the other hand, returns F6THM6, P35609 and Q59FD9, which are the same results as the ones obtained from the web without any filter.
I hope my problem is clearly explained. If possible, I would like a programmatic answer.
Using Entrez Gene ID 88 as the query on the UniProt page you linked to gives me F6THM6, P35609 and Q59FD9 as results, so your script gives the correct result in this particular case.
Yes, using 88 as the query on the UniProt page gives those 3 identifiers, but with the reviewed-only filter the result is just P35609. In theory the script should return only the reviewed entry, but it doesn't.
I read too quickly and missed the bit about the filter. Shouldn't 'reviewed3%Ayes' be 'reviewed=yes'? urlencode should take care of the encoding (i.e. converting = to %3D), and I don't know of a character with code 3%A.
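To make the encoding point concrete, here is a minimal sketch of what urlencode does to the filter string. It uses Python 3's urllib.parse rather than the Python 2 urllib from the question, and it makes no network call. Note that %3A is the percent-encoding of ':', so 'reviewed3%Ayes' may be a transposed 'reviewed%3Ayes'; that reading is a guess. The values 'reviewed:yes' and 'reviewed=yes' are candidate spellings only. Whether the mapping endpoint honors a 'fil' parameter at all is not confirmed anywhere in this thread.

```python
from urllib.parse import urlencode, parse_qs

# Filter string exactly as written in the question. urlencode treats it as
# plain text, so the literal '%' is itself escaped to '%25'.
as_posted = urlencode({'fil': 'reviewed3%Ayes'})

# Candidate plain-text spellings (assumptions, not confirmed UniProt syntax).
# urlencode escapes ':' to '%3A' and '=' to '%3D' on its own.
with_colon = urlencode({'fil': 'reviewed:yes'})
with_equals = urlencode({'fil': 'reviewed=yes'})

print(as_posted)    # fil=reviewed3%25Ayes
print(with_colon)   # fil=reviewed%3Ayes
print(with_equals)  # fil=reviewed%3Dyes

# What a server would decode from the string as posted: the garbled text
# arrives unchanged, not as 'reviewed:yes'.
decoded = parse_qs(as_posted)
print(decoded)      # {'fil': ['reviewed3%Ayes']}
```

In other words, values passed to urlencode should be written unescaped. A value that is already percent-encoded gets encoded a second time.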
I've tried every way I could think of but didn't get what I wanted. In the end I generated a list of the Gene IDs and did it by hand, using the list export option that UniProt provides.
Thank you for the help