Retrieve all reviewed proteins from a list of species names
2.0 years ago
Francesco ▴ 20

Hi, I have to retrieve all reviewed proteins from UniProt for a list of species names (stored in a CSV file, 646 species). I tried to use the UniProt API service and unipressed (a Python library). This is the script I wrote:

from unipressed import UniprotkbClient
import pandas as pd

data_df = pd.read_csv('organisms.csv', header=0, sep=',')
species_names = data_df['Species'].dropna()

for record in UniprotkbClient.search(
    query={'organism_name': s for s in species_names}
).each_record():
    display(record)

Unfortunately, the for loop does not retrieve the proteins of all species (it downloads only the proteins of the last species). I tried to add an AND condition to download only reviewed proteins, but I get an error message. Please help me :')
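A likely reason only the last species comes back: `{'organism_name': s for s in species_names}` is a dict comprehension with a single constant key, so every species overwrites the previous one and only one search is run. A minimal sketch of one way around it is to build a separate query string per species and add the reviewed filter with AND. The helper name `build_query` and the three example species are made up for illustration; the `organism_name` and `reviewed` fields are assumed to follow UniProt's documented query syntax.

```python
def build_query(species: str, reviewed_only: bool = True) -> str:
    """Build a UniProtKB query string for one species (hypothetical helper)."""
    query = f'(organism_name:"{species}")'
    if reviewed_only:
        # Restrict the search to reviewed (Swiss-Prot) entries.
        query += " AND (reviewed:true)"
    return query


# Example species, standing in for the 'Species' column of the CSV file.
species_names = ["Homo sapiens", "Mus musculus", "Danio rerio"]

# A dict comprehension with a constant key keeps only the last value,
# which is why only the last species was searched.
collapsed = {"organism_name": s for s in species_names}
print(collapsed)  # {'organism_name': 'Danio rerio'}

# One query string per species instead of one overwritten dict.
queries = [build_query(s) for s in species_names]
for q in queries:
    print(q)
```

Each string can then be sent to UniProt as its own search, one species at a time.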

uniprot api rest proteins • 859 views

Have you seen the examples that UniProt provides for Python queries? See https://www.uniprot.org/help/api_queries

If you are trying to download a large amount of data from public resources, you will want to pause between the requests for each species to avoid overloading the servers or getting your IP banned.
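A rough sketch of what pausing between species could look like, using only the Python standard library and UniProt's REST `stream` endpoint. The function names `fetch_fasta` and `fetch_all`, the FASTA output format, and the one-second default pause are assumptions chosen for illustration, not values recommended by UniProt.

```python
import time
import urllib.parse
import urllib.request

STREAM_URL = "https://rest.uniprot.org/uniprotkb/stream"


def fetch_fasta(query: str) -> str:
    """Download all entries matching a query as FASTA text."""
    params = urllib.parse.urlencode({"query": query, "format": "fasta"})
    with urllib.request.urlopen(f"{STREAM_URL}?{params}") as response:
        return response.read().decode("utf-8")


def fetch_all(species_names, fetch=fetch_fasta, pause_seconds=1.0):
    """Fetch reviewed proteins species by species, pausing between requests."""
    results = {}
    for i, species in enumerate(species_names):
        if i > 0:
            # Assumed pause length; adjust to be polite to the server.
            time.sleep(pause_seconds)
        query = f'(organism_name:"{species}") AND (reviewed:true)'
        results[species] = fetch(query)
    return results
```

Calling `fetch_all(species_names)` with the cleaned `Species` column would return a dict mapping each species name to its FASTA text. The `fetch` argument is there so the download step can be swapped out, for example for a different output format.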


Thank you!


I have a script here, which you can modify: Need help to retrive sequences

You just need to change the tax ID encoded in the script. It is currently set to txid9606 (Homo sapiens).

Kevin
