UniProtKB - mapping gene name to ID (*_HUMAN) using Python 2
6.0 years ago
jg • 0

Hello!

I have a large list of kinase (gene) names extracted from the UniProtKB modified residue section. (e.g. MAPK1, CDK1, SRC, ATM etc.)

I am trying to convert these names to their entry names (ID) to get:

MAPK1: MK01_HUMAN

CDK1: CDK1_HUMAN

SRC: SRC_HUMAN

etc...

The problem is that for every gene name I get many IDs, and I only want the official one (see below), which always seems to be the only reviewed one.

I tried adding 'columns': 'reviewed' and 'organism': 'human' in the params below but it has no effect. I am basically lost!

An example using MAPK1 kinase:

import urllib,urllib2

    url = 'https://www.uniprot.org/uploadlists/'

    params = {
    'from':'GENENAME',
    'to':'ID',
    'format':'tab',
    'query':'MAPK1',
    'columns': 'reviewed'
    }

    data = urllib.urlencode(params)
    request = urllib2.Request(url, data)
    contact = "xxxx@outlook.com" 
    request.add_header('User-Agent', 'Python %s' % contact)
    response = urllib2.urlopen(request)
    header = response.readline()
    entries=response.read()

    id_list=[]
    new_entries=entries.split("\n")
    for element in new_entries:
        if element=="":
            continue
        else:
            element=element.split("\t")
            if "_HUMAN" in element[1]:
                id_list.append(element[1])

The final id_list is: ['MK01_HUMAN', 'Q1HBJ4_HUMAN', 'Q499G7_HUMAN']

I am only interested in extracting the 'main' identifier; MK01_HUMAN. Please can anyone help?

uniprot databases parsing python api • 2.4k views
6.0 years ago
vkkodali_ncbi ★ 3.8k

Try changing your url to https://www.uniprot.org/uniprot/ and params to:

{
    'query': 'gene_exact:mapk1 AND organism:homo_sapiens AND reviewed:yes', 
    'format': 'tab', 
    'columns': 'id,entry_name,genes'
}
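
Combined with the request code from the question, this suggestion might look like the sketch below. It is written for Python 3, where `urllib.parse` and `urllib.request` replace the Python 2 `urllib`/`urllib2` pair. The helper names `build_params`, `encode_params` and `fetch_reviewed_entry` are made up for illustration. The URL, query fields and column names are copied from the answer as written, so they only work as long as the service still accepts them.

```python
import urllib.parse
import urllib.request

BASE_URL = 'https://www.uniprot.org/uniprot/'  # URL given in the answer


def build_params(gene):
    # Exact gene-name match, human only, reviewed entries only.
    # The gene name is lowercased to mirror the answer's 'mapk1'.
    query = 'gene_exact:%s AND organism:homo_sapiens AND reviewed:yes' % gene.lower()
    return {
        'query': query,
        'format': 'tab',
        'columns': 'id,entry_name,genes',
    }


def encode_params(params):
    # urlencode returns str in Python 3; Request expects bytes for a POST body.
    return urllib.parse.urlencode(params).encode('ascii')


def fetch_reviewed_entry(gene, contact='xxxx@outlook.com'):
    # Network call (not executed here): returns the raw tab-separated text.
    request = urllib.request.Request(BASE_URL, encode_params(build_params(gene)))
    request.add_header('User-Agent', 'Python %s' % contact)
    with urllib.request.urlopen(request) as response:
        return response.read().decode('utf-8')
```

Calling `fetch_reviewed_entry('MAPK1')` sends the same kind of request as the original script, only with the new URL and parameters. With `reviewed:yes` in the query, the unreviewed entries should no longer be returned.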

Thank you so much - it worked. Now I can sleep peacefully after hours of staring at this :)


@vkkodali I get an error like:

    url = "https://www.uniprot.org/uniprot/"
    ^
    IndentationError: unexpected indent


Python seems to be complaining about indentation. If you have copy/pasted the code from above, you should make sure the indentation is correct. I don't think you need to indent the entire code block after the import statement.


@vkkodali I just corrected the indentation; now it runs but does not give any output.


Without more detailed information from you, I cannot be of much help. What have you tried? How did you run the code? Did you run it as a script? Or did you run it at the python interpreter? Did you use the example posted here or did you use your own example? I am not sure what you mean by 'does not give any output'. What were you expecting? The code, as written in the first post, does not output anything. It populates a list called id_list. Did you see anything in that list?
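
To see what ends up in `id_list`, the parsing loop from the first post can be run on a saved response and the result printed explicitly. A minimal sketch in Python 3 follows. The function name `extract_human_ids` and the sample text are illustrative: the sample is reconstructed from the three IDs reported in the question, not captured from the server.

```python
def extract_human_ids(text):
    """Return the second column of each row whose value contains '_HUMAN'."""
    id_list = []
    for line in text.split('\n'):
        if line == '':
            continue  # skip empty lines, e.g. after the trailing newline
        fields = line.split('\t')
        if len(fields) > 1 and '_HUMAN' in fields[1]:
            id_list.append(fields[1])
    return id_list


# Illustrative rows only; the header line is assumed to have been read off already.
sample = 'MAPK1\tMK01_HUMAN\nMAPK1\tQ1HBJ4_HUMAN\nMAPK1\tQ499G7_HUMAN\n'

id_list = extract_human_ids(sample)
print(id_list)  # without a print, running the script shows nothing on screen
```

Run as a script, the original code fills `id_list` silently. Adding the `print` call (or inspecting `id_list` in the interpreter) is what makes the result visible.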
