Entrez.efetch HTTP Error 400: Bad request
7.0 years ago

Hey everyone,

I have a weird problem. I am trying to download a number of assemblies from GenBank using Entrez.efetch. Here is my code:

from Bio import Entrez, SeqIO
import csv, sys, os, time, shutil
import httplib, urllib2

Entrez.email="mymail@gmail.com"

def download_genomes():

    #Search for all bacterial assemblies in the assembly database and get their ids
    search_term= "bacteria[orgn] AND all[filter]"
    handle=Entrez.esearch(db="assembly", retmax=500000, term=search_term)
    genome_id=Entrez.read(handle)['IdList']
    print "Fetched Id list..."

    for id in genome_id:

        while True:
            try:
                #Fetch the entry corresponding to the id
                record=Entrez.efetch(db='assembly', id=id, rettype='fasta', retmode='text')
                time.sleep(3)
                seq_record=Entrez.efetch(db='assembly', id=id, rettype='gbwithparts', retmode='text')
                seq_meta=SeqIO.read(seq_record, "genbank")
                .            
                .  
                .
                #Skipped rest of the code which writes downloaded files to spec. dirs and so on

However, when I run this, I get an HTTP Error 400: Bad request for every id. When I try the ids manually in GenBank, the entries are found. Does anybody know what could be going on here? I would appreciate the help!

Cheers!

Biopython Entrez • 7.0k views

NCBI moved to exclusive https connections last year. Are you using the latest version of all modules?


Yes, that sounds like a very good explanation. I don't think your code is wrong; 400 errors usually have something to do with the server (wrong URL, moved resource, etc.).


Any idea what could be wrong? I can find the genome in question with the id fetched by Entrez, so I don't think the resource itself was moved...


Hey, thanks for the fast reply! I am using Biopython 1.69, which is the latest version supported by Anaconda to my knowledge...


Did it work recently, e.g. yesterday, and then suddenly stop working? If so, maybe they just have a temporary problem with a server or something; that can happen as well.

7.0 years ago

I did not get it to work using Entrez, so I just wrote another script that uses wget to get the sequences via HTTPS (based on the instructions at https://www.ncbi.nlm.nih.gov/genome/doc/ftpfaq/#downloadservice , in the section 'To use HTTPS'). It only works properly when I put a time.sleep() call after every request, but it now seems to work.
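
For anyone who wants to stay in Python instead of shelling out to wget, here is a rough sketch of the same idea (Python 3, not the script I actually used). It assumes you already have the FTP directory path of each assembly and that the directory follows the usual NCBI layout, where the directory name is also the prefix of the `_genomic.fna.gz` file. The function names `https_fasta_url`, `fetch_bytes` and `download_all`, as well as the one-second default pause, are made up for this example.

```python
import time
import urllib.request


def https_fasta_url(ftp_path):
    """Build an HTTPS URL for the genomic FASTA inside an assembly directory.

    Assumption: the directory name doubles as the file-name prefix,
    i.e. <dir>/<dir name>_genomic.fna.gz.
    """
    base = ftp_path.rstrip("/")
    # Swap the scheme so the request goes over HTTPS instead of FTP.
    if base.startswith("ftp://"):
        base = "https://" + base[len("ftp://"):]
    prefix = base.rsplit("/", 1)[-1]
    return f"{base}/{prefix}_genomic.fna.gz"


def fetch_bytes(url):
    """Download one URL and return the raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_all(urls, delay_seconds=1.0, fetch=fetch_bytes, sleep=time.sleep):
    """Fetch each URL in order, pausing between consecutive requests.

    `fetch` and `sleep` are parameters so they can be replaced,
    for example to try the loop without touching the network.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause before every request except the first
        results[url] = fetch(url)
    return results
```

Calling `download_all([https_fasta_url(path) for path in ftp_paths])` would return a dict mapping each URL to the downloaded bytes, which can then be written to disk. How long the pause needs to be is something to find out by trial; I have not measured it.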

Thanks for the help!

Cheers
