Question

Retrieve Protein Sequences from Uniprot

4.3 years ago

biohacker_tobe ▴ 80

I have a list of protein IDs, all of which trace back to UniProt. Can I obtain the sequences of these proteins from their UniProt IDs? Is there a package in Biopython to do this?

I found this snippet of code online, and it does return the sequence, but I am not sure whether there is a better way:

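    import requests as r
    from Bio import SeqIO
    from io import StringIO

    cID = 'A0A061AD41'

    # Build the FASTA download URL for this accession
    baseUrl = "http://www.uniprot.org/uniprot/"
    currentUrl = baseUrl + cID + ".fasta"

    # Downloading a resource is a GET request; response.text is already a str
    response = r.get(currentUrl)
    cData = response.text

    # Wrap the text in a file-like object so SeqIO can parse it
    Seq = StringIO(cData)
    pSeq = list(SeqIO.parse(Seq, 'fasta'))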

where pSeq prints:

[SeqRecord(seq=Seq('MQAALIGLNFPLQRRFLSGVLTTTSSAKRCYSGDTGKPYDCTSAEHKKELEECY...SSS', SingleLetterAlphabet()), id='sp|O45228|PROD_CAEEL', name='sp|O45228|PROD_CAEEL', description='sp|O45228|PROD_CAEEL Proline dehydrogenase 1, mitochondrial OS=Caenorhabditis elegans OX=6239 GN=prdh-1 PE=2 SV=2', dbxrefs=[])]

sequence python biopython uniprot • 1.6k views

updated 4.3 years ago by Shalu Jhanwar ▴ 540 • written 4.3 years ago by biohacker_tobe ▴ 80

score 1 · Answer 1 · 2020-09-21

4.3 years ago

Shalu Jhanwar ▴ 540

Have a look at the previous post here, which shows how to retrieve sequences from UniProt protein IDs.

I saw that this uses a Linux command directly; I was curious whether I could do the same thing directly from Python.

written 4.3 years ago by biohacker_tobe ▴ 80
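
For a whole list of IDs, one possible pure-Python approach is sketched below. It is not taken from the linked post: it uses only the standard library, and it assumes the `https://rest.uniprot.org/uniprotkb/<accession>.fasta` endpoint. The helper names `fasta_url`, `parse_fasta`, and `fetch_sequences` are made up for illustration. The downloaded text could equally be wrapped in `StringIO` and handed to `SeqIO.parse`, as in the question.

```python
from urllib.request import urlopen

# Assumption: UniProt's REST endpoint for single-entry FASTA downloads.
BASE_URL = "https://rest.uniprot.org/uniprotkb/"


def fasta_url(accession):
    """Build the FASTA download URL for one UniProt accession."""
    return f"{BASE_URL}{accession}.fasta"


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fetch_sequences(accessions, timeout=30):
    """Download and parse the FASTA record(s) for each accession.

    Requires network access; HTTP errors (e.g. an unknown accession)
    propagate as exceptions from urlopen.
    """
    results = {}
    for acc in accessions:
        with urlopen(fasta_url(acc), timeout=timeout) as response:
            text = response.read().decode("utf-8")
        results[acc] = parse_fasta(text)
    return results


# Example call (hits the network, so it is left commented out):
# sequences = fetch_sequences(["A0A061AD41"])
```

Keeping the parsing separate from the download means the parser can be checked offline on any FASTA string. For long ID lists it would be polite to pause between requests or to use a batch download service instead.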