Get all completely sequenced genomes from one genus
6.8 years ago
bird77 ▴ 80

Is there an automatic way to get the fasta sequences of all sequenced (preferably completely) genomes within a taxonomic group?

And how can I get the taxid for all of these organisms as well?

Thank you.

genome • 1.4k views

For Ensembl, there is no dedicated API route that I know of. If you are specifically interested in bacteria from Ensembl Genomes, here is a hackish script you can adapt.

6.8 years ago
tdmurphy ▴ 230

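This is easily accomplished with NCBI's Assembly resource. For example, this query lists the latest RefSeq assemblies of complete bacterial genomes: https://www.ncbi.nlm.nih.gov/assembly/?term=bacteria%5Borgn%5D+latest_refseq%5Bfilter%5D+complete_genome%5Bfilter%5D

To restrict the results to a single genus, replace "bacteria" in the [orgn] term with the genus name. You can then download FASTA, annotation, or other files using the big blue "Download Assemblies" button.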

Note that "complete genome" is a useful filter for bacteria, but only a handful of eukaryotic assemblies are sequenced to completion (mostly fungi). If you're interested in eukaryotes, you may want either to focus on assemblies at the "chromosome" level (to exclude WGS assemblies that are just bags of scaffolds) or to use the "exclude partial" filter, which drops the small number of assemblies that cover only a subset of the genome (e.g. just one chromosome).
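
If you would rather script this than click through the web page, the same search can probably be run through NCBI's E-utilities, which also returns the taxid for each assembly. The following is only a rough sketch, not an official recipe. It assumes the esummary JSON for the assembly database carries the fields `assemblyaccession`, `organism`, `taxid`, `ftppath_refseq` and `ftppath_genbank`, and that the genomic FASTA sits at `<ftp path>/<directory name>_genomic.fna.gz`. The function names (`build_term`, `fasta_url`, `parse_summaries`, `fetch_assemblies`) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_term(taxon):
    """Build the same Assembly query as the web link, for an arbitrary taxon name."""
    return f"{taxon}[orgn] AND latest_refseq[filter] AND complete_genome[filter]"


def fasta_url(ftp_path):
    """Derive the genomic FASTA URL from an assembly's FTP directory.

    Assumption: the file is named <directory name>_genomic.fna.gz.
    """
    if not ftp_path:
        return None
    path = ftp_path.rstrip("/")
    base = path.split("/")[-1]
    # Serve the file over HTTPS rather than FTP.
    path = path.replace("ftp://", "https://", 1)
    return f"{path}/{base}_genomic.fna.gz"


def parse_summaries(esummary_json):
    """Extract accession, organism, taxid and FASTA URL from an esummary reply."""
    result = esummary_json.get("result", {})
    rows = []
    for uid in result.get("uids", []):
        doc = result[uid]
        # Prefer the RefSeq copy, fall back to GenBank (assumed field names).
        ftp_path = doc.get("ftppath_refseq") or doc.get("ftppath_genbank")
        rows.append({
            "accession": doc.get("assemblyaccession"),
            "organism": doc.get("organism"),
            "taxid": doc.get("taxid"),
            "fasta_url": fasta_url(ftp_path),
        })
    return rows


def _get_json(endpoint, **params):
    """Call one E-utilities endpoint and decode its JSON reply."""
    params["retmode"] = "json"
    url = f"{EUTILS}/{endpoint}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def fetch_assemblies(taxon, retmax=200):
    """Search the assembly database for a taxon and summarise the hits."""
    search = _get_json("esearch.fcgi", db="assembly",
                       term=build_term(taxon), retmax=retmax)
    ids = search["esearchresult"]["idlist"]
    if not ids:
        return []
    summary = _get_json("esummary.fcgi", db="assembly", id=",".join(ids))
    return parse_summaries(summary)
```

Calling `fetch_assemblies("Escherichia")` should then give one dictionary per assembly, each with its taxid and a FASTA URL you can pass to any downloader. For large result sets you would need to page through the IDs in batches and keep within NCBI's request rate limits.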
