Programmatic download of complete genomes from NCBI for specific taxonomy identifier
7.6 years ago
marc.bourqui ▴ 20

Hi all,

I am looking for a programmatic way to download complete genomes from RefSeq for a specific taxonomy identifier. In my case, I am interested in the order Lactobacillales.

  • From the Genome Download FTP it is not possible to filter by order, only by genus and species.
  • I tried to follow the steps described in this post. First two steps (esearch and elink) are okay, but then I do not know how to select my genomes of interest according to my criterion (complete and from RefSeq).
  • I also tried the Ebot pipeline generator, but again I am not sure which query qualifiers to apply.
  • My best approach so far is to use the Genome browser. From there, I can apply my filters and then download the selected records as a .csv or .txt. How can I get the same output file without using the web interface?

Thanks in advance for any hints and help!

genome ncbi • 2.0k views
7.6 years ago

Try this, it should work.

  1. Get the taxids of all species belonging to the order using taxonkit list,
  2. then download all RefSeq complete genomes whose species_taxid is in that list, following this post (section "Filter by species_taxid").
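
The two steps above could be sketched in Python roughly as follows. This is only an illustration under some assumptions, not the method from the linked post: it assumes the taxid list was written beforehand with something like `taxonkit list --ids 186826 --indent "" > taxids.txt` (186826 should be the taxid of Lactobacillales, but please verify it), and that RefSeq's `assembly_summary.txt` has been downloaded and still carries a header line naming the columns `assembly_accession`, `taxid`, `species_taxid`, `version_status`, `assembly_level` and `ftp_path`. The function names `read_taxids` and `select_complete_genomes` are made up for this example.

```python
def read_taxids(lines):
    """Collect taxids (one per line, possibly indented) into a set of strings."""
    return {line.strip() for line in lines if line.strip()}


def select_complete_genomes(summary_lines, wanted_taxids):
    """Yield (assembly_accession, ftp_path) for the latest complete genomes
    whose taxid or species_taxid is in wanted_taxids.

    Columns are looked up by name from the commented header line, which is
    an assumption about the layout of assembly_summary.txt.
    """
    header = None
    for line in summary_lines:
        line = line.rstrip("\r\n")
        if line.startswith("#"):
            # The header looks like "# assembly_accession<TAB>bioproject<TAB>..."
            fields = line.lstrip("# ").split("\t")
            if "assembly_accession" in fields:
                header = fields
            continue
        if header is None or not line:
            continue
        row = dict(zip(header, line.split("\t")))
        if row.get("assembly_level") != "Complete Genome":
            continue
        if row.get("version_status") != "latest":
            continue
        if row.get("species_taxid") in wanted_taxids or row.get("taxid") in wanted_taxids:
            yield row["assembly_accession"], row["ftp_path"]
```

To use it, open both files, pass the taxid file to `read_taxids` and the summary file to `select_complete_genomes`, then write out the returned `ftp_path` values and hand that list to a download tool such as wget or rsync. Matching on both `taxid` and `species_taxid` is a deliberate choice here, so that strain-level records are picked up either way.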