How can I retrieve genomes from NCBI via Biopython? I am able to get a record for my genome of interest, and I can also download it manually from the search results.
But how do I download it from inside a script?
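from Bio import Entrez
Entrez.email = "my_email@email.ru"
# Search the genome database for the organism of interest
handle = Entrez.esearch(db="genome", term="Drosophila eugracilis[Orgn]", idtype="acc")
record = Entrez.read(handle)
handle.close()
# Print every field of the search result
for key in record.keys():
    print(key, record[key])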
You need efetch; see for example: https://stackoverflow.com/a/26347810/3691040
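For reference, a minimal sketch of what an efetch call through Biopython might look like. The helper names `efetch_kwargs` and `fetch_sequences` are made up for illustration, and the sketch assumes you already have valid nucleotide IDs or accessions, which is exactly the part that turns out to be tricky in this thread.

```python
def efetch_kwargs(ids, rettype="fasta"):
    """Build keyword arguments for Entrez.efetch against the nucleotide db.

    `ids` may be a single ID/accession string or a list of them.
    """
    if isinstance(ids, str):
        ids = [ids]
    return {
        "db": "nucleotide",
        "id": ",".join(ids),   # efetch accepts a comma-separated ID list
        "rettype": rettype,
        "retmode": "text",
    }


def fetch_sequences(ids, email, rettype="fasta"):
    """Fetch the records for `ids` and return them as one text string."""
    # Imported here so efetch_kwargs can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.efetch(**efetch_kwargs(ids, rettype))
    try:
        return handle.read()
    finally:
        handle.close()
```

You would call it as `fetch_sequences(some_id, "my_email@email.ru")`, where `some_id` is a placeholder for an ID that really exists in the nucleotide database.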
The problem with using the nucleotide db is that the resulting list of IDs is empty:
And when I use efetch with this genome ID, it finds something else:
which is not a genome.
Yeah that's weird, I see the same thing.
Have you tried your query directly on NCBI's website first to see what you get?
It seems you need to use db="genome", not nucleotide, in your efetch. ID 6863 in Nucleotide points to that sRNA, but that same ID number in Genome does correctly point to that Drosophila species (or find out what ID the Drosophila genome is using inside nucleotide).
This is true. I used the ID from db=genome, which I first found on the NCBI web server, but when I change db to genomes in the last request, it says
Hi there,
It is not possible to download the sequences directly from the genome database; you will need to link to the actual sequence-holding records using elink.
Can you please give an example of using it in a pipeline with efetch to download a genome? Or point me to a tutorial page that covers it?
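Here is a rough sketch of how elink and efetch could be chained, not a tested recipe. It assumes that the genome record links to nuccore records when queried with dbfrom="genome" and db="nuccore", and that the parsed elink result has the usual LinkSetDb / Link / Id layout returned by Entrez.read. The function names `linked_ids` and `download_genome_fasta` are made up for illustration.

```python
def linked_ids(elink_records):
    """Collect the target IDs from a parsed elink result (Entrez.read output)."""
    ids = []
    for record in elink_records:
        for linkset in record.get("LinkSetDb", []):
            for link in linkset.get("Link", []):
                ids.append(str(link["Id"]))
    # Drop duplicates while keeping the original order
    return list(dict.fromkeys(ids))


def download_genome_fasta(genome_id, email, out_path):
    """Follow genome -> nuccore links and write the linked sequences as FASTA.

    Returns the number of linked nuccore IDs that were requested.
    """
    # Imported here so linked_ids can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email

    # Step 1: ask elink which nuccore records belong to this genome ID
    # (assumption: such links exist for the genome in question).
    handle = Entrez.elink(dbfrom="genome", db="nuccore", id=genome_id)
    records = Entrez.read(handle)
    handle.close()

    ids = linked_ids(records)
    if not ids:
        raise ValueError("elink returned no nuccore links for this genome ID")

    # Step 2: fetch the linked records as FASTA and stream them to disk.
    handle = Entrez.efetch(db="nuccore", id=",".join(ids),
                           rettype="fasta", retmode="text")
    with open(out_path, "w") as out:
        for line in handle:
            out.write(line)
    handle.close()
    return len(ids)
```

With the ID mentioned above, the call would look like `download_genome_fasta("6863", "my_email@email.ru", "genome.fasta")`. If elink returns a very long list of IDs, it is probably safer to fetch them in batches rather than in a single efetch request.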