use bioservices.UniProt (python) to map uniprot accession to [multiple] ensembl ids
12 months ago
mk ▴ 300

I am just trying to get a (one to many) mapping from uniprot accession -> Ensembl ids.

In the spirit of exploration, I have the following code, which uses bioservices.UniProt to pull all available columns for a selection of UniProt accession ids.

from bioservices import UniProt
u = UniProt()
accession_numbers = ['P30561', 'P53762']
result = u.get_df(accession_numbers)
type(result)
result.columns.tolist()

I know that this mapping exists in UniProt. Here is the annotation field on the UniProt website for the first accession (P30561):

[screenshot not preserved]

However, going through the output columns of result above, the column labeled "Annotation" has the following contents:

0    5.0
1    5.0
2    5.0
3    5.0
4    5.0
5    5.0
6    4.0
7    1.0

This is clearly not a list of Ensembl ids.

bioservices ensembl python uniprot • 740 views

The Annotation field seems to correspond to the Annotation Score, which you would have figured out with a little bit of effort.

result[['Entry','Annotation']]
    Entry  Annotation
0  P30561         5.0
1  P53762         5.0
2  O94763         5.0
3  P97481         5.0
4  P79832         5.0
5  Q61221         5.0
6  Q61045         4.0
7  G7ZFL7         1.0

Annotation score for:

Look for a column that can take you to the NCBI/Entrez Gene ID. There is no basis to assume that a column titled Annotation will give you a gene ID.
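One way to look for such a column is to filter the column names by keyword. The sketch below is only an illustration: `find_columns` is a hypothetical helper, and the example column names are placeholders, not the actual output of `get_df`. In practice you would pass `result.columns.tolist()` from the code in the question.

```python
def find_columns(columns, keywords):
    """Return the column names containing any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [c for c in columns if any(k in c.lower() for k in lowered)]


# Placeholder column names for illustration only; the real names depend on
# the bioservices version and on what UniProt returns.
example_columns = ["Entry", "Annotation", "Gene names", "Cross-reference (GeneID)"]

matches = find_columns(example_columns, ["gene", "ensembl"])
print(matches)
```

If nothing matches, the identifier you want is probably not among the returned columns and has to be requested another way.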

12 months ago
mk ▴ 300

Turns out this works:

from bioservices import UniProt
u = UniProt()
accession_numbers = ['P30561', 'P53762']
# Use the ID mapping service instead of get_df: map UniProtKB accessions to Ensembl
u.mapping(fr='UniProtKB_AC-ID', to='Ensembl', query=accession_numbers)
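
To get the one-to-many mapping asked for in the question, the returned pairs still need to be grouped by accession. The sketch below assumes the mapping result exposes a list of dicts with "from" and "to" keys (for example under a "results" key); that shape is an assumption and may differ between bioservices versions, so inspect the actual return value first. `group_mapping_pairs` is a hypothetical helper, and the Ensembl ids are placeholders, not real identifiers.

```python
from collections import defaultdict


def group_mapping_pairs(pairs):
    """Group {'from': ..., 'to': ...} pairs into {accession: [target ids]}."""
    grouped = defaultdict(list)
    for pair in pairs:
        targets = grouped[pair["from"]]
        # Skip duplicates so each target id appears once per accession.
        if pair["to"] not in targets:
            targets.append(pair["to"])
    return dict(grouped)


# Placeholder pairs for illustration; with a live result you might pass
# something like mapping_result["results"] (assumed key, check your version).
example_pairs = [
    {"from": "P30561", "to": "ENSEMBL_ID_A"},
    {"from": "P30561", "to": "ENSEMBL_ID_B"},
    {"from": "P53762", "to": "ENSEMBL_ID_C"},
]

one_to_many = group_mapping_pairs(example_pairs)
print(one_to_many)
```

Accessions that UniProt could not map will simply be absent from the resulting dict, so compare its keys against `accession_numbers` if you need to detect them.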