Question

Entrez.efetch HTTP Error 400: Bad request

7.0 years ago

wanderingstefan ▴ 30

Hey everyone,

I have a weird problem. I am trying to download a number of assemblies from GenBank using Entrez.efetch. Here is my code:

from Bio import Entrez, SeqIO
import csv, sys, os, time, shutil
import httplib, urllib2

Entrez.email="mymail@gmail.com"

def download_genomes():

    #Search for all bacterial assemblies in the assembly database and get their ids
    search_term= "bacteria[orgn] AND all[filter]"
    handle=Entrez.esearch(db="assembly", retmax=500000, term=search_term)
    genome_id=Entrez.read(handle)['IdList']
    print "Fetched Id list..."

    for id in genome_id:

        while True:
            try:
                #Fetch the entry corresponding to the id
                record=Entrez.efetch(db='assembly', id=id, rettype='fasta', retmode='text')
                time.sleep(3)
                seq_record=Entrez.efetch(db='assembly', id=id, rettype='gbwithparts', retmode='text')
                seq_meta=SeqIO.read(seq_record, "genbank")
                .            
                .  
                .
                #Skipped rest of the code which writes downloaded files to spec. dirs and so on

However, when I run this, I get an HTTP Error 400: Bad Request for every id. When I look up the ids manually in GenBank, the entries are found. Does anybody know what could be going on here? I would appreciate the help!

Cheers!

Biopython Entrez • 7.0k views

ADD COMMENT • link updated 21 months ago by Ram 44k • written 7.0 years ago by wanderingstefan ▴ 30

NCBI moved to exclusive https connections last year. Are you using the latest version of all modules?

ADD REPLY • link 7.0 years ago by GenoMax 148k
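A quick way to check which versions are actually installed in the environment the script runs in is sketched below. This is only an illustration: the helper name `installed_versions` is made up for this example, and it assumes Python 3.8 or newer (for `importlib.metadata`), which is not what the Python 2 code in the question uses.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["biopython"]).items():
        print(name, ver or "not installed")
```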

Yes, that sounds like a very good explanation. I don't think your code is wrong; 400 errors usually have something to do with the server (wrong URL, moved resource, etc.) ....

ADD REPLY • link 7.0 years ago by LLTommy ★ 1.2k

Any idea what could be wrong? I can find the genome in question with the id fetched by entrez, so I don't think the resource itself was moved...

ADD REPLY • link 7.0 years ago by wanderingstefan ▴ 30

Hej, thanks for the fast reply! I am using biopython 1.69, which is the latest version supported by anaconda to my knowledge...

ADD REPLY • link 7.0 years ago by wanderingstefan ▴ 30

Did it work recently, e.g. yesterday, and then suddenly stop working? If so .... maybe they just have a temporary problem with a server or something; that can happen as well.

ADD REPLY • link 7.0 years ago by LLTommy ★ 1.2k

score 0 · Answer 1 · 2017-12-13

I did not get it to work using Entrez, so I just wrote another script that uses wget to get the sequences via HTTPS (based on the instructions at https://www.ncbi.nlm.nih.gov/genome/doc/ftpfaq/#downloadservice , in the section 'To use HTTPS'). It only works properly when I put a time.sleep() call after every request, but it now seems to work.
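Roughly, the approach looks like the sketch below. It is not the exact script: it uses Python 3's `urllib` in place of wget, the function names are made up for illustration, and it assumes each assembly directory holds a file named `<directory name>_genomic.fna.gz`. The 3-second delay is just the value used in the question's code, not an official limit.

```python
import time
import urllib.request
from pathlib import Path


def genomic_fasta_url(ftp_path):
    """Turn an assembly directory path into the HTTPS URL of its genomic FASTA.

    Assumes the file is named '<directory name>_genomic.fna.gz'.
    """
    https_dir = ftp_path.rstrip("/").replace("ftp://", "https://", 1)
    name = https_dir.rsplit("/", 1)[-1]
    return f"{https_dir}/{name}_genomic.fna.gz"


def download_all(ftp_paths, out_dir, delay=3.0, fetch=urllib.request.urlretrieve):
    """Download each assembly in turn, sleeping after every request."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for ftp_path in ftp_paths:
        url = genomic_fasta_url(ftp_path)
        target = out_dir / url.rsplit("/", 1)[-1]
        fetch(url, str(target))  # one HTTPS request per assembly
        saved.append(target)
        time.sleep(delay)  # pause so the requests are not fired back to back
    return saved
```

Calling `download_all(paths, "genomes")` with a list of assembly directory paths would save one `.fna.gz` file per assembly into `genomes/`. The `fetch` argument is only there so the download step can be swapped out, e.g. for a dry run.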

Thanks for the help!

Cheers