Hello again! I'm trying to map Ensembl protein IDs to UniProtKB using the UniProt REST API, but although I have 237 items in the list, only 25 results are returned.
Here is my code snippet:
import time
import requests

ensembl_id_list = data_df["Ensembl_ID"].tolist()  # Returns 237 IDs
ensembl_id_list = list(dict.fromkeys(ensembl_id_list))  # Removes duplicates, keeps order
URL = 'https://rest.uniprot.org/idmapping'
params = {
    'from': 'Ensembl_Protein',
    'to': 'UniProtKB',
    'ids': ' '.join(ensembl_id_list)
}
response = requests.post(f'{URL}/run', params)
job_id = response.json()['jobId']
ensembl_to_uniprot_dict = {}
# Poll the job status every 1 sec until the results are ready
while True:
    # Re-request the status on every pass so an unfinished job cannot loop forever
    job_status = requests.get(f'{URL}/status/{job_id}')
    d = job_status.json()
    if d.get("jobStatus") == 'FINISHED' or d.get('results'):
        job_results = requests.get(f'{URL}/results/{job_id}')
        results = job_results.json()
        for obj in results['results']:
            ensembl_to_uniprot_dict[obj["from"]] = obj["to"]
            #print(f'{obj["from"]}\t{obj["to"]}')
        break
    time.sleep(1)
But when I query len(ensembl_to_uniprot_dict), it's only 25 entries long. What's going on and how do I fix it?
It works!! Thanks so much; adding the size=500 parameter to the results request fixed the issue. My list is 237 entries long, so it fits within the limit. :)
Great. I moved my comment to the answer. You can accept the answer if it has solved your issue.
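For anyone landing here later, this is a minimal sketch of the fix discussed above, assuming the results endpoint is paginated and returns 25 entries per page unless a size is given. It requests pages of up to 500 entries and, for longer lists, follows the "next" URL that the API is assumed to advertise in the Link response header. The names results_url, next_page_url and fetch_all_results are hypothetical helpers, not part of any library.

```python
import re

BASE_URL = "https://rest.uniprot.org/idmapping"
# Assumption: further pages are advertised as  Link: <url>; rel="next"
NEXT_LINK = re.compile(r'<([^>]+)>;\s*rel="next"')


def results_url(job_id, size=500):
    """Build the results URL with an explicit page size."""
    return f"{BASE_URL}/results/{job_id}?size={size}"


def next_page_url(headers):
    """Return the URL of the next page from a Link header, or None."""
    match = NEXT_LINK.search(headers.get("Link", ""))
    return match.group(1) if match else None


def fetch_all_results(job_id, size=500, get=None):
    """Collect the from/to pairs from every page of a finished job."""
    if get is None:
        import requests  # imported lazily so the helpers above work without it
        get = requests.get
    mapping = {}
    url = results_url(job_id, size)
    while url:
        response = get(url)
        response.raise_for_status()
        for obj in response.json()["results"]:
            # As in the question, a later hit for the same ID overwrites an earlier one
            mapping[obj["from"]] = obj["to"]
        url = next_page_url(response.headers)
    return mapping
```

Once the job has finished, calling fetch_all_results(job_id) would replace the single requests.get(f'{URL}/results/{job_id}') call in the question. With 237 IDs a single page of 500 should be enough, so the loop over further pages only matters for longer lists.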