9.3 years ago
Maxime B
I'm trying to download a marker sequence (ITS) for every plant species using Biopython and NCBI esearch/efetch.
It works, but it is slow and it stops after a few hundred downloads (my plant list is 86,000 species long), probably because of NCBI's download policy.
To avoid this, I'd like to run esearch/efetch against a local database, but how? I know I can download the NCBI databases from ftp://ftp.ncbi.nlm.nih.gov/, but how do I interface them with esearch/efetch/Biopython locally?
Thanks
I'm not sure why you want to use Biopython for this task. You can just download the corresponding flatfiles and write a custom parser to extract the information you need.
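As a rough illustration of the flatfile approach, here is a minimal sketch of a custom parser that reads GenBank-format records from an already downloaded file and keeps only ITS entries within a length range. The function names, the "internal transcribed spacer" keyword match on the DEFINITION line, and the length bounds are assumptions made for this example, not anything prescribed by NCBI. If Biopython is kept, Bio.SeqIO.parse(handle, "genbank") can replace the hand-written record parsing.

```python
import re


def _parse_record(lines):
    """Extract accession, definition, organism and sequence from one record."""
    record = {"accession": "", "definition": "", "organism": "", "sequence": ""}
    section = None
    seq_parts = []
    for line in lines:
        if line.startswith("ACCESSION"):
            fields = line.split()
            record["accession"] = fields[1] if len(fields) > 1 else ""
            section = None
        elif line.startswith("DEFINITION"):
            record["definition"] = line[len("DEFINITION"):].strip()
            section = "definition"
        elif line.startswith("  ORGANISM"):
            record["organism"] = line[len("  ORGANISM"):].strip()
            section = None
        elif line.startswith("ORIGIN"):
            section = "origin"
        elif section == "definition" and line.startswith(" "):
            # DEFINITION may wrap onto indented continuation lines
            record["definition"] += " " + line.strip()
        elif section == "origin":
            # Sequence lines look like "        1 acgtacgtac gt": drop digits/spaces
            seq_parts.append(re.sub(r"[^a-zA-Z]", "", line))
        else:
            section = None
    record["sequence"] = "".join(seq_parts).upper()
    return record


def iter_genbank_records(handle):
    """Yield one dict per record; records are terminated by a '//' line."""
    lines = []
    for line in handle:
        if line.startswith("//"):
            if lines:
                yield _parse_record(lines)
            lines = []
        else:
            lines.append(line.rstrip("\n"))


def its_records(handle, min_len=100, max_len=2000):
    """Yield ITS records whose sequence length lies in [min_len, max_len].

    The keyword test and the default bounds are illustrative assumptions.
    """
    for rec in iter_genbank_records(handle):
        is_its = "internal transcribed spacer" in rec["definition"].lower()
        if is_its and min_len <= len(rec["sequence"]) <= max_len:
            yield rec
```

Usage would be along the lines of opening an uncompressed flatfile with open(path) and iterating over its_records(handle); the species list can then be matched against rec["organism"].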
Because it allows me to filter easily on the length of the sequence, and (mainly) because I had already written the piece of code a while ago.
But sure, I'll go with regexps if there is no way to run esearch/efetch on a local database...
Entrez utilities only work via the web. There is no way to provide that exact functionality locally.
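Since the E-utilities have to be called over the web, one way to reduce the number of requests is to fetch many records per call and pause between calls. The following is only a sketch under assumptions: the batch size of 200, the 0.4 second pause, the tool name and the helper names are all illustrative choices, and the IDs are assumed to come from earlier esearch calls. It uses the standard library rather than Biopython so that it stands alone.

```python
import time
import urllib.parse
import urllib.request

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def batched(ids, size=200):
    """Yield consecutive slices of `ids` with at most `size` items each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def efetch_params(ids, email):
    """Build efetch parameters for one batch of nucleotide IDs (FASTA output)."""
    return {
        "db": "nuccore",
        "id": ",".join(ids),
        "rettype": "fasta",
        "retmode": "text",
        "tool": "its_downloader",  # hypothetical tool name for this example
        "email": email,
    }


def fetch_fasta(ids, email, size=200, pause=0.4):
    """Yield FASTA text for each batch, pausing between requests.

    Without an API key NCBI asks for at most 3 requests per second;
    the default pause stays below that rate.
    """
    for batch in batched(ids, size):
        data = urllib.parse.urlencode(efetch_params(batch, email)).encode()
        # Passing `data` makes this a POST, which suits long ID lists
        with urllib.request.urlopen(EFETCH_URL, data=data) as handle:
            yield handle.read().decode()
        time.sleep(pause)
```

With 86,000 species this still means many requests, so the local flatfile route may remain the more practical option.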