Get the UniProt accession from an Ensembl protein ID
9.2 years ago
Moses ▴ 150

Hi,

I have a list of Ensembl protein pairs (homologs) with their respective Ensembl protein IDs, and I want to convert these Ensembl protein IDs to UniProt IDs. For example, given ENSP00000361930 I want to get 1433B_HUMAN.

Is there a function in Biopython that does this conversion? I need to script it because the list is huge and I can't do it manually.

Thank you

annotation uniprot ensembl genome protein • 5.5k views
9.2 years ago

Use the UniProt ID mapping tool: http://www.uniprot.org/uploadlists/
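Since the question asks for something scriptable, here is a minimal sketch of driving ID mapping from Python instead of the web form. Everything about the service is an assumption on my part rather than something stated in this thread: the base URL https://rest.uniprot.org/idmapping, the database names Ensembl_Protein and UniProtKB, the run/status/results paths, and the JSON field names (jobId, jobStatus, primaryAccession, uniProtkbId). Check them against the current UniProt documentation before relying on this.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed base URL of the UniProt ID mapping REST service (not from the thread).
API = "https://rest.uniprot.org/idmapping"


def submit_job(ensembl_protein_ids):
    """Submit an Ensembl protein -> UniProtKB mapping job and return its job ID."""
    payload = urllib.parse.urlencode({
        "from": "Ensembl_Protein",  # assumed source database name
        "to": "UniProtKB",          # assumed target database name
        "ids": ",".join(ensembl_protein_ids),
    }).encode()
    # Passing data= makes urllib send a POST request.
    with urllib.request.urlopen(f"{API}/run", data=payload) as response:
        return json.load(response)["jobId"]


def wait_for_job(job_id, poll_seconds=3):
    """Poll the status endpoint until the job is no longer queued or running."""
    while True:
        with urllib.request.urlopen(f"{API}/status/{job_id}") as response:
            status = json.load(response)
        # Assumption: a finished job either reports FINISHED or redirects to
        # the results, in which case no "jobStatus" key is present.
        state = status.get("jobStatus", "FINISHED")
        if state == "FINISHED":
            return
        if state not in ("NEW", "RUNNING"):
            raise RuntimeError(f"mapping job ended with status {state}")
        time.sleep(poll_seconds)


def fetch_results(job_id):
    """Download all results of a finished job as parsed JSON."""
    url = f"{API}/uniprotkb/results/stream/{job_id}?format=json"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def pairs_from_results(results_json):
    """Extract (query_id, accession, entry_name) tuples from a results payload."""
    pairs = []
    for item in results_json.get("results", []):
        target = item["to"]
        if isinstance(target, dict):
            # Full UniProtKB entry: keep the accession and the entry name.
            pairs.append((item["from"],
                          target.get("primaryAccession"),
                          target.get("uniProtkbId")))
        else:
            # Plain accession string: no entry name available.
            pairs.append((item["from"], target, None))
    return pairs
```

The intended call order is submit_job(ids), then wait_for_job(job_id), then pairs_from_results(fetch_results(job_id)). Only pairs_from_results is free of network access, so it is the part that is easy to check offline.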


Well, I thought about that, but whenever I upload a file with a list of Ensembl protein IDs, for example:

ENSP00000379287
ENSG00000166913
ENSP00000355042
ENSP00000379287
ENSP00000379287

it gives me FASTA output with protein sequences, accession numbers and other information that I do not need:

>tr|A0A0J9YWE8|A0A0J9YWE8_HUMAN 14-3-3 protein beta/alpha OS=Homo sapiens GN=YWHAB PE=4 SV=1
MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSS
WRVISSIEQKTERNEKKQQMGKEYREKIEAELQDICNDVLFFRMPHSKTTLRKYCSVYEA
WTPSELLLLSCWTNILFPMLHNQKVRCST
>tr|A0A0J9YWZ2|A0A0J9YWZ2_HUMAN 14-3-3 protein beta/alpha (Fragment) OS=Homo sapiens GN=YWHAB PE=4 SV=1
XAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSSWRVISSIEQKTERNEKKQQMGKEYR
EKIEAELQDICNDVLVHLVFR
>sp|P31946|1433B_HUMAN 14-3-3 protein beta/alpha OS=Homo sapiens GN=YWHAB PE=1 SV=3
MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSS
WRVISSIEQKTERNEKKQQMGKEYREKIEAELQDICNDVLELLDKYLIPNATQPESKVFY
LKMKGDYFRYLSEVASGDNKQTTVSNSQQAYQEAFEISKKEMQPTHPIRLGLALNFSVFY
YEILNSPEKACSLAKTAFDEAIAELDTLNEESYKDSTLIMQLLRDNLTLWTSENQGDEGD
AGEGEN
>sp|P31946-2|1433B_HUMAN Isoform Short of 14-3-3 protein beta/alpha OS=Homo sapiens GN=YWHAB
MDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSSWR
VISSIEQKTERNEKKQQMGKEYREKIEAELQDICNDVLELLDKYLIPNATQPESKVFYLK
MKGDYFRYLSEVASGDNKQTTVSNSQQAYQEAFEISKKEMQPTHPIRLGLALNFSVFYYE
ILNSPEKACSLAKTAFDEAIAELDTLNEESYKDSTLIMQLLRDNLTLWTSENQGDEGDAG
EGEN
>tr|Q4VY19|Q4VY19_HUMAN 14-3-3 protein beta/alpha (Fragment) OS=Homo sapiens GN=YWHAB PE=1 SV=1
MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSS
WRVISSIEQKTERNEKKQQMGKEYREKIEAELQDICNDVL
>tr|Q4VY20|Q4VY20_HUMAN 14-3-3 protein beta/alpha (Fragment) OS=Homo sapiens GN=YWHAB PE=1 SV=1
MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSS
WRVISSIEQKTERN

Is there a way to get just the protein IDs rather than all this information?
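If FASTA output like the above is all that is at hand, the IDs can be read off the header lines. This is a minimal sketch assuming every header follows the ">db|ACCESSION|ENTRY_NAME description" layout seen in the output above. The function name parse_fasta_headers is hypothetical, and the sequence lines in the example are shortened. Note that this recovers the UniProt IDs but not which Ensembl ID each one came from.

```python
import re

# Header layout assumed from the output above: ">db|ACCESSION|ENTRY_NAME ..."
HEADER_RE = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)")


def parse_fasta_headers(fasta_text):
    """Return (database, accession, entry_name) for every matching header line."""
    records = []
    for line in fasta_text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            records.append(match.groups())
    return records


# Three headers copied from the thread; sequence lines are shortened here.
example = """\
>tr|A0A0J9YWE8|A0A0J9YWE8_HUMAN 14-3-3 protein beta/alpha OS=Homo sapiens GN=YWHAB PE=4 SV=1
MTMDKSELVQ
>sp|P31946|1433B_HUMAN 14-3-3 protein beta/alpha OS=Homo sapiens GN=YWHAB PE=1 SV=3
MTMDKSELVQ
>sp|P31946-2|1433B_HUMAN Isoform Short of 14-3-3 protein beta/alpha OS=Homo sapiens GN=YWHAB
MDKSELVQ
"""

for database, accession, entry_name in parse_fasta_headers(example):
    print(database, accession, entry_name)
```

Filtering the result on database == "sp" would keep only the reviewed (Swiss-Prot) entries, which is where 1433B_HUMAN appears in this output.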


Click the "Download" button and choose Format: "Mapping Table".
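Once the mapping table is downloaded, loading it into a dictionary takes a few lines. This is a minimal sketch assuming the file is tab-separated with one header row and the query ID in the first column and the UniProt ID in the second. That layout is my assumption and is not confirmed in this thread. The function name read_mapping_table is hypothetical, and the example row pairs ENSP00000361930 with P31946 because the question maps that ID to 1433B_HUMAN, whose accession is P31946 in the FASTA output above.

```python
import csv
import io
from collections import defaultdict


def read_mapping_table(handle):
    """Map each query ID to the list of UniProt IDs it was mapped to."""
    mapping = defaultdict(list)
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the assumed header row, e.g. "From<TAB>To"
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        query_id, uniprot_id = row[0], row[1]
        # Repeated input IDs produce repeated rows; keep each target once.
        if uniprot_id not in mapping[query_id]:
            mapping[query_id].append(uniprot_id)
    return dict(mapping)


# Illustrative table contents, including a duplicated row.
example = (
    "From\tTo\n"
    "ENSP00000361930\tP31946\n"
    "ENSP00000361930\tP31946\n"
)

print(read_mapping_table(io.StringIO(example)))
```

For a real download, pass an open file instead of the StringIO object, for example open("mapping_table.tsv", newline=""), where the file name is a placeholder. A list is used as the value because one Ensembl ID can map to several UniProt entries.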

2.3 years ago

Try gget info: https://github.com/pachterlab/gget


Perfect! Why am I just hearing about this now? :)
