Hey everyone,
I have a weird problem. I am trying to download a number of assemblies from GenBank using Entrez.efetch. Here is my code:
from Bio import Entrez, SeqIO
import csv, sys, os, time, shutil
import httplib, urllib2

Entrez.email = "mymail@gmail.com"

def download_genomes():
    # Search for all bacterial assemblies in the assembly database and get their ids
    search_term = "bacteria[orgn] AND all[filter]"
    handle = Entrez.esearch(db="assembly", retmax=500000, term=search_term)
    genome_id = Entrez.read(handle)['IdList']
    print "Fetched Id list..."
    for id in genome_id:
        while True:
            try:
                # Fetch the entry corresponding to the id
                record = Entrez.efetch(db='assembly', id=id, rettype='fasta', retmode='text')
                time.sleep(3)
                seq_record = Entrez.efetch(db='assembly', id=id, rettype='gbwithparts', retmode='text')
                seq_meta = SeqIO.read(seq_record, "genbank")
                .
                .
                .
                # Skipped rest of the code which writes downloaded files to spec. dirs and so on
However, when I run this, I get an HTTP Error 400: Bad Request for every id. When I try the ids manually in GenBank, the entry is found. Does anybody know what could be going on here? I would appreciate the help!
Cheers!
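A bare "HTTP Error 400" does not say much, but the response body sometimes carries the server's own explanation. Below is a minimal sketch for surfacing it, assuming Python 3, where as far as I know the failure shows up as urllib.error.HTTPError (urllib2.HTTPError on Python 2). The names try_fetch and fake_fetch are hypothetical; fake_fetch only stands in for Entrez.efetch so the example runs offline, and the id and error text are made up.

```python
import io
import urllib.error


def try_fetch(fetch, *args, **kwargs):
    """Call fetch(*args, **kwargs); return (True, result) or (False, error text)."""
    try:
        return True, fetch(*args, **kwargs)
    except urllib.error.HTTPError as err:
        # The response body often contains the server's own explanation.
        body = err.read().decode("utf-8", errors="replace")
        return False, "HTTP %d %s: %s" % (err.code, err.reason, body.strip())


def fake_fetch(**params):
    # Hypothetical stand-in for Entrez.efetch so the sketch runs offline.
    raise urllib.error.HTTPError(
        "https://example.org/efetch", 400, "Bad Request", None,
        io.BytesIO(b"example error body"))


ok, detail = try_fetch(fake_fetch, db="assembly", id="12345")
print(ok, detail)
```

In the real script, Entrez.efetch would take the place of fake_fetch, and the printed detail might show which parameter the server objects to.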
NCBI moved to HTTPS-only connections last year. Are you using the latest version of all modules?
Yes, that sounds like a very good explanation. I don't think your code is wrong; 400 errors usually have something to do with the server (wrong URL, moved resource, etc.).
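On the HTTPS point, one way to take the installed modules out of the picture is to build the same efetch request by hand against the HTTPS endpoint and open it in a browser. This is only a sketch, assuming Python 3; build_efetch_url is a hypothetical helper and "12345" is a placeholder id, not a real assembly.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base address of the NCBI E-utilities, HTTPS only.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_efetch_url(db, uid, rettype, retmode, email):
    """Assemble an efetch URL with the same parameters the script passes."""
    query = urlencode({
        "db": db,
        "id": uid,
        "rettype": rettype,
        "retmode": retmode,
        "email": email,
    })
    return EUTILS_BASE + "efetch.fcgi?" + query


# "12345" is a placeholder id used only for illustration.
url = build_efetch_url("assembly", "12345", "fasta", "text", "mymail@gmail.com")
print(url)
```

If the hand-built URL also comes back with a 400, the module versions are probably not the cause and the request parameters themselves deserve a closer look.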
Any idea what could be wrong? I can find the genome in question with the id fetched by entrez, so I don't think the resource itself was moved...
Hey, thanks for the fast reply! I am using Biopython 1.69, which is the latest version supported by Anaconda to my knowledge...
Did it work recently, e.g. yesterday, and just suddenly stop working? If so, maybe they just have a temporary problem with a server or something; that can happen as well.
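A quick way to tell a temporary server problem from a persistent one is a bounded retry instead of an endless `while True` loop: if the request still fails after a few spaced attempts, it is probably not a hiccup. A rough sketch, assuming Python 3; call_with_retries and flaky are hypothetical names, and flaky merely simulates a call that fails twice before succeeding.

```python
import time


def call_with_retries(func, max_attempts=3, delay=3.0, sleep=time.sleep):
    """Call func() up to max_attempts times, waiting longer after each failure."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as err:  # narrow this to HTTP/URL errors in real use
            last_error = err
            if attempt < max_attempts:
                sleep(delay * attempt)  # linear backoff: delay, 2*delay, ...
    raise last_error


attempts = []


def flaky():
    # Hypothetical stand-in for a network call: fails twice, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError("temporary failure")
    return "ok"


waits = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky, max_attempts=5, delay=3.0, sleep=waits.append)
print(result, waits)
```

Passing the sleep function in as an argument keeps the example fast to run; in the download script the default time.sleep would be used.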