4.2 years ago
biohacker_tobe
I have a list of protein IDs, all of which trace back to UniProt. I would like to know whether I can obtain the sequence of each protein from its UniProt ID. Is there a package in Biopython that does this?
I found this snippet of code online, and it does return the sequence information, but I am not sure whether there is a better way:
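import requests as r
from Bio import SeqIO
from io import StringIO

cID = 'A0A061AD41'
baseUrl = "http://www.uniprot.org/uniprot/"
currentUrl = baseUrl + cID + ".fasta"
response = r.get(currentUrl)  # GET rather than POST: the FASTA file is only being retrieved
cData = response.text  # response.text is already a single string
Seq = StringIO(cData)  # wrap the text so SeqIO can read it like a file
pSeq = list(SeqIO.parse(Seq, 'fasta'))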
where pSeq prints:
[SeqRecord(seq=Seq('MQAALIGLNFPLQRRFLSGVLTTTSSAKRCYSGDTGKPYDCTSAEHKKELEECY...SSS', SingleLetterAlphabet()), id='sp|O45228|PROD_CAEEL', name='sp|O45228|PROD_CAEEL', description='sp|O45228|PROD_CAEEL Proline dehydrogenase 1, mitochondrial OS=Caenorhabditis elegans OX=6239 GN=prdh-1 PE=2 SV=2', dbxrefs=[])]
I saw that this uses a Linux command directly, and I was curious whether I could do the same thing directly from Python.
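For a whole list of IDs, one possible approach is to wrap the same fetch-and-parse step in a loop. The sketch below is only an illustration under a few assumptions: it uses the standard library instead of requests, it assumes the FASTA file for an accession is served at https://rest.uniprot.org/uniprotkb/<accession>.fasta (not the URL from the snippet above), and the names build_url, parse_fasta and fetch_sequences are hypothetical helpers, not part of Biopython.

```python
from urllib.request import urlopen

# Assumption: UniProt REST endpoint for per-accession FASTA downloads.
BASE_URL = "https://rest.uniprot.org/uniprotkb/"


def build_url(accession):
    """Build the FASTA download URL for one accession (hypothetical helper)."""
    return f"{BASE_URL}{accession}.fasta"


def parse_fasta(text):
    """Return a list of (header, sequence) tuples parsed from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fetch_sequences(accessions, timeout=30):
    """Map each accession to its parsed FASTA records (requires network access)."""
    results = {}
    for acc in accessions:
        with urlopen(build_url(acc), timeout=timeout) as resp:
            results[acc] = parse_fasta(resp.read().decode("utf-8"))
    return results
```

Calling fetch_sequences(['A0A061AD41']) would then return a dictionary keyed by accession. If SeqRecord objects are preferred, parse_fasta could be swapped for list(SeqIO.parse(StringIO(text), 'fasta')), as in the snippet above.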