Question

How to download FASTA files for all bacteria of a particular genus available in NCBI?

5.3 years ago

sart ▴ 10

Hi Biostars!

I am sorry if my question is basic. I would like to download the fasta files for the strains below:

https://www.ncbi.nlm.nih.gov/genome/genomes/539

Can I do this automatically?

Thanks for your help.

score 2 · Answer 1 · 2020-01-13

5.3 years ago

Asaf 10k

Use the wrapper package ncbi-genome-download (https://github.com/kblin/ncbi-genome-download). It is as easy as:

ncbi-genome-download --genus "Streptomyces coelicolor" bacteria

score 1 · Answer 2 · 2020-01-13

5.3 years ago

JC 13k

Use Entrez to get them

Istvan Albert · Answer 3 · 2020-01-13

Hi Sart!!

I use the following workflow to download large sets of bacterial genomes from NCBI. It has let me download thousands of genomes.

Execute this workflow stepwise:

# Download the complete list of RefSeq genomes (RefSeq is a curated subset of GenBank)
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_refseq.txt

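# Preview a few metadata columns for the matching assemblies
grep -E 'Acinetobacter.*baumannii' assembly_summary_refseq.txt | cut -f 8,9,14,15,16

# Save the FTP directory of each matching assembly (column 20)
grep -E 'Acinetobacter.*baumannii' assembly_summary_refseq.txt | cut -f 20 > ftp_folder.txt

# Check the first few directories
head ftp_folder.txt

# Write one wget command per assembly for the genomic FASTA file
awk 'BEGIN{FS=OFS="/";filesuffix="genomic.fna.gz"}{ftpdir=$0;asm=$10;file=asm"_"filesuffix;print "wget "ftpdir,file}' ftp_folder.txt > download_fna_files.sh

# Do the same for the GFF annotation file
awk 'BEGIN{FS=OFS="/";filesuffix="genomic.gff.gz"}{ftpdir=$0;asm=$10;file=asm"_"filesuffix;print "wget "ftpdir,file}' ftp_folder.txt > download_gff_files.sh

# Inspect the generated commands before running them
head download_fna_files.sh

# Run the FASTA downloads
source download_fna_files.sh

ls

# Decompress the downloaded files
gzip -d *.gz

ls

# Show the first (header) line of each FASTA file
head -1 *.fna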
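
The filtering and URL-building steps above can be tried offline before downloading anything. The sketch below is not part of the original workflow: it runs on two made-up rows (the accessions, assembly names, and the helper `make_row` are placeholders for illustration), assuming the same layout as the workflow, with the organism name in column 8 and the FTP directory in column 20. It matches the genus against the organism-name column only, rather than grepping the whole line, and takes the last path component instead of the fixed `$10`.

```shell
# Hypothetical helper: emit one 20-column, tab-separated row.
# $1 = organism name (column 8), $2 = FTP directory (column 20);
# every other column is filled with "na".
make_row() {
  printf 'na\tna\tna\tna\tna\tna\tna\t%s\tna\tna\tna\tna\tna\tna\tna\tna\tna\tna\tna\t%s\n' "$1" "$2"
}

# Made-up sample data standing in for assembly_summary_refseq.txt.
base="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/000"
sample=$(
  make_row "Acinetobacter baumannii" "$base/001/GCF_000000001.1_ASM1v1"
  make_row "Escherichia coli" "$base/002/GCF_000000002.1_ASM2v1"
)

# Keep rows whose organism name starts with the genus, then append
# "<assembly directory name>_genomic.fna.gz" to the FTP directory.
urls=$(printf '%s\n' "$sample" |
  awk -F'\t' -v genus="Acinetobacter" '
    index($8, genus " ") == 1 {
      n = split($20, parts, "/")
      print $20 "/" parts[n] "_genomic.fna.gz"
    }')

echo "$urls"
```

Only the Acinetobacter row should survive the filter. Once the printed URLs look right, the same awk program can be pointed at the real summary file and its output passed to wget.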

score 1 · Answer 4 · 2020-01-13

5.3 years ago

vkkodali_ncbi ★ 3.8k

Go to https://www.ncbi.nlm.nih.gov/genome/539 and click the 'Assembly' link on the right-hand side. It takes you to the Assembly page listing the 8 linked assemblies. On that page, click the blue 'Download Assemblies' button at the top, then choose the source database and file type to download.
