Using BLAST API in Python

4.5 years ago

DdogBoss ▴ 20

I am trying to obtain full protein sequences by protein accession number.

I have the protein accession numbers in CSV files, but I am struggling with how to set up the URL.

The problem is that when I paste the printed URL into a web browser, I just land on the BLAST web page.

Sample query: query = "LHTPMY", dataset = "nr", and service = "blastp"

One example protein accession number: XP_023329844.1

What I have so far is:

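import requests


def NCBI(query, dataset, service):
    """Submit a BLAST search through the NCBI URL API and return the response page."""
    request_base = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"
    # Pass the parameters as a dict so requests URL-encodes them
    # and no stray spaces end up around the "=" signs.
    params = {
        "CMD": "Put",  # the URL API needs a command; Put submits a search
        "QUERY": str(query),
        "DATABASE": dataset,
        "PROGRAM": service,
        "NCBI_GI": "T",
    }
    r = requests.get(url=request_base, params=params)
    r.raise_for_status()
    print(r.url)
    # The Put response is an HTML page, not JSON; it carries the request ID (RID).
    return r.text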

Documentation here: https://ncbi.github.io/blast-cloud/dev/api.html
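
Going by the linked documentation, a search is submitted with CMD=Put, and the HTML that comes back carries a request ID (RID) inside a QBlastInfo comment; results are then fetched with CMD=Get once the status is READY. Below is a minimal sketch of that second half, using only the standard library. The helper names (parse_qblast_info, build_get_url, wait_for_results) are made up for illustration, and the exact layout of the QBlastInfo block is an assumption based on NCBI's published examples.

```python
import re
import time
from urllib.parse import urlencode
from urllib.request import urlopen

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def parse_qblast_info(page):
    """Pull the RID and estimated wait (RTOE, seconds) out of a Put response."""
    block = re.search(r"QBlastInfoBegin(.*?)QBlastInfoEnd", page, re.DOTALL)
    if block is None:
        raise ValueError("no QBlastInfo block in response")
    rid = re.search(r"RID[ \t]*=[ \t]*(\S+)", block.group(1))
    rtoe = re.search(r"RTOE[ \t]*=[ \t]*(\d+)", block.group(1))
    if rid is None:
        raise ValueError("no RID in QBlastInfo block")
    wait = int(rtoe.group(1)) if rtoe else 0
    return rid.group(1), wait


def build_get_url(rid, **extra):
    """Build a CMD=Get URL for a given RID, e.g. FORMAT_TYPE='Text'."""
    params = {"CMD": "Get", "RID": rid}
    params.update(extra)
    return BLAST_URL + "?" + urlencode(params)


def wait_for_results(rid, poll_seconds=60, format_type="Text"):
    """Poll until the search is READY, then return the formatted report."""
    status_url = build_get_url(rid, FORMAT_OBJECT="SearchInfo")
    while True:
        with urlopen(status_url) as resp:
            info = resp.read().decode("utf-8", errors="replace")
        if "Status=READY" in info:
            break
        if "Status=FAILED" in info or "Status=UNKNOWN" in info:
            raise RuntimeError(f"search {rid} did not complete")
        # NCBI asks clients not to poll a single RID more than once a minute.
        time.sleep(poll_seconds)
    with urlopen(build_get_url(rid, FORMAT_TYPE=format_type)) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

The idea would be to pass the text of the Put response to parse_qblast_info, wait roughly RTOE seconds, and then call wait_for_results with the returned RID. Only wait_for_results touches the network.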

Resources and advice are welcome.

Thank you in advance.

BLAST Python • 2.6k views

ADD COMMENT • link updated 4.5 years ago by Ram 45k • written 4.5 years ago by DdogBoss ▴ 20

Can you not just use the existing Biopython implementation of NCBIWWW etc?

ADD REPLY • link 4.5 years ago by Joe 22k
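
To make the Biopython suggestion concrete, here is a rough sketch of what that route could look like. It assumes Biopython is installed, that the accessions sit in the first column of the CSV with no header row, and that a contact email is supplied for Entrez. The function names are hypothetical. Note that fetching a full sequence by accession is an Entrez lookup rather than a BLAST search, so the sketch shows both.

```python
import csv
import io


def read_accessions(csv_text, column=0):
    """Collect non-empty accession strings from one column of CSV text."""
    rows = csv.reader(io.StringIO(csv_text))
    return [
        row[column].strip()
        for row in rows
        if len(row) > column and row[column].strip()
    ]


def fetch_protein(accession, email):
    """Fetch one protein record from NCBI as a Biopython SeqRecord."""
    # Imported here so read_accessions works even without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.efetch(
        db="protein", id=accession, rettype="fasta", retmode="text"
    )
    try:
        return SeqIO.read(handle, "fasta")
    finally:
        handle.close()


def blast_peptide(sequence, program="blastp", database="nr"):
    """Run a remote BLAST search and return the parsed record."""
    from Bio.Blast import NCBIWWW, NCBIXML

    # qblast submits the search, waits for it, and returns an XML handle.
    handle = NCBIWWW.qblast(program, database, sequence)
    try:
        return NCBIXML.read(handle)
    finally:
        handle.close()
```

With this in place, something like fetch_protein("XP_023329844.1", "you@example.org").seq should give the full sequence for the example accession, and blast_peptide("LHTPMY") would run the sample query; both calls go over the network and the BLAST one can take a while.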