Question

CURL UniprotIDs not working consistently

3.8 years ago

nimojose • 0

Hello folks,

I am going down the Rosalind problem sets, and I am currently trying to obtain the protein amino acid sequences from Uniprot using the Uniprot IDs given.

My method was to take each ID and insert it into a UniProt website URL. The first two IDs do not work, while the last two work perfectly. I tried running curl -I to look for any hint of what's wrong, but it did not help. I am now trying to make this work with the Python 3 API directly; hopefully that works.

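import subprocess

ids2 = ['P01866', 'P81448', 'Q640N1', 'Q0TMT1']
with open('output.txt', 'w') as f:
    for i in ids2:
        # -s silences curl's progress meter; each response body is written to output.txt
        subprocess.run(['curl', '-s', f'https://www.uniprot.org/uniprot/{i}.fasta'], stdout=f, text=True)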

curl uniprot python bash • 1.2k views

ADD COMMENT • link updated 3.8 years ago by GenoMax 151k • written 3.8 years ago by nimojose • 0

score 1 · Answer 1 · 2021-07-07

In case of doubt, I recommend trying your query on the UniProt website, https://www.uniprot.org. If you query for the entries individually, or use the Batch retrieve service (https://www.uniprot.org/uploadlists), you will find that the first two ACs in your list are secondary accession numbers, and the URL https://www.uniprot.org/uniprot/{i}.fasta does not work for secondary accession numbers:

secondary  primary  entry name   protein name                gene
P01866     P01867   IGG2B_MOUSE  Ig gamma-2B chain C region  Igh-3
P81448     P13727   PRG2_HUMAN   Bone marrow proteoglycan    PRG2 MBP

An accession number becomes secondary if the entry is merged into another existing entry (see https://www.uniprot.org/help/accession_numbers).

For interactive users of the website, the entry view for a secondary accession number is redirected to the primary entry for convenience. For programmatic access, the "/uniprot/secondary_ac.fasta" REST URLs do not work. Among other reasons, this warns programmatic users that there is something that deserves a closer look.
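One way to act on that warning in a script is to check each response before saving it, so that IDs that fail are reported instead of silently producing no sequence. The following is only a minimal sketch: the helper names (looks_like_fasta, fetch_fasta, fetch_all) are hypothetical, and it assumes the URL pattern from the question, which may not match what the UniProt site serves today.

```python
import urllib.error
import urllib.request

# URL pattern taken from the question; it may not match the current UniProt site.
FASTA_URL = "https://www.uniprot.org/uniprot/{acc}.fasta"


def looks_like_fasta(text):
    """Return True if the text starts with a FASTA header line ('>')."""
    return text.lstrip().startswith(">")


def fetch_fasta(acc, timeout=30):
    """Hypothetical helper: return the FASTA text for acc, or None on failure."""
    url = FASTA_URL.format(acc=acc)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            text = response.read().decode("utf-8")
    except urllib.error.URLError:
        # Covers HTTP errors (HTTPError is a subclass) and connection problems.
        return None
    return text if looks_like_fasta(text) else None


def fetch_all(accessions):
    """Split accessions into retrieved records and IDs that need a closer look."""
    records, failed = {}, []
    for acc in accessions:
        text = fetch_fasta(acc)
        if text is None:
            failed.append(acc)
        else:
            records[acc] = text
    return records, failed
```

Calling fetch_all(ids2) would return the retrieved records together with a list of the accessions that did not yield a FASTA record, which can then be checked by hand on the website.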

This query works a bit differently and redirects to the correct current URL:

https://www.uniprot.org/uniprot/?query=id:P01866&format=fasta

redirects to

https://www.uniprot.org/uniprot/P01867.fasta
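
The query form above could be used from Python roughly as follows. This is a sketch rather than a tested recipe: build_query_url and fetch_by_query are hypothetical names, and it assumes the URL quoted in this answer is still served in the same form, which may no longer be the case.

```python
import urllib.parse
import urllib.request

# Base URL as quoted in the answer; assumed to still be served in this form.
BASE_URL = "https://www.uniprot.org/uniprot/"


def build_query_url(acc):
    """Build the query-style URL for one accession number."""
    params = {"query": f"id:{acc}", "format": "fasta"}
    # safe=":" keeps the colon unescaped so the URL matches the form shown above.
    return BASE_URL + "?" + urllib.parse.urlencode(params, safe=":")


def fetch_by_query(acc, timeout=30):
    """Hypothetical helper: return (final_url, fasta_text) for one accession."""
    # urlopen follows HTTP redirects by default.
    with urllib.request.urlopen(build_query_url(acc), timeout=timeout) as response:
        # geturl() reports the final URL after any redirects were followed.
        return response.geturl(), response.read().decode("utf-8")
```

Comparing the returned final URL with the requested accession is one way to notice that a secondary accession number was resolved to a primary one. With curl, following the redirect requires the -L flag.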