Hi Biostars!
I am sorry if my question is basic. I would like to download the fasta files for the strains below:
https://www.ncbi.nlm.nih.gov/genome/genomes/539
Can I do this automatically?
Thanks for your help.
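Use the wrapper package ncbi-genome-download (https://github.com/kblin/ncbi-genome-download). By default it fetches GenBank files, so check ncbi-genome-download --help for the format option if you want FASTA. It is as easy as: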
ncbi-genome-download --genus "Streptomyces coelicolor" bacteria
Hi Sart!!
I use the following workflow to download large sets of bacterial genomes from NCBI. It has let me download thousands of genomes.
Execute this workflow stepwise:
# Download the complete list of RefSeq genomes (RefSeq is a curated subset of GenBank)
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_refseq.txt
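# Preview the matching assemblies: organism name, strain, genome representation, release date, assembly name
grep -E 'Acinetobacter.*baumannii' assembly_summary_refseq.txt | cut -f 8,9,14,15,16
# Save the FTP directory of each matching assembly (column 20)
grep -E 'Acinetobacter.*baumannii' assembly_summary_refseq.txt | cut -f 20 > ftp_folder.txt
head ftp_folder.txt
# Write one wget command per assembly, for the genomic FASTA (.fna.gz) and for the annotation (.gff.gz)
awk 'BEGIN{FS=OFS="/";filesuffix="genomic.fna.gz"}{ftpdir=$0;asm=$10;file=asm"_"filesuffix;print "wget "ftpdir,file}' ftp_folder.txt > download_fna_files.sh
awk 'BEGIN{FS=OFS="/";filesuffix="genomic.gff.gz"}{ftpdir=$0;asm=$10;file=asm"_"filesuffix;print "wget "ftpdir,file}' ftp_folder.txt > download_gff_files.sh
head download_fna_files.sh
# Run the downloads (repeat with download_gff_files.sh if you also want the annotations)
source download_fna_files.sh
ls
# Decompress, then check the first header line of each FASTA file
gzip -d *.gz
ls
head -1 *.fna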
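If you prefer not to generate an intermediate script, the grep, cut and awk steps above can be folded into one small function. This is only a sketch: the function name make_urls is my own invention, and it assumes the same column layout as the workflow above (organism name in column 8, FTP directory in column 20). It takes the assembly name from the last path component instead of a fixed field number and skips rows without an FTP path.

```shell
#!/usr/bin/env bash
# Sketch: print one download URL per assembly whose organism name matches a pattern.
# Assumed layout of assembly_summary_refseq.txt (tab-separated):
#   column 8  = organism name
#   column 20 = FTP directory of the assembly
#
# Usage: make_urls PATTERN [SUFFIX] < assembly_summary_refseq.txt
# make_urls is a hypothetical helper name, not part of any NCBI tool.
make_urls() {
    local pattern="$1"
    local suffix="${2:-genomic.fna.gz}"
    awk -F '\t' -v pat="$pattern" -v suffix="$suffix" '
        /^#/ { next }                              # skip header/comment lines
        $8 ~ pat && $20 != "" && $20 != "na" {     # organism matches and a path exists
            n = split($20, parts, "/")             # last path component = assembly name
            print $20 "/" parts[n] "_" suffix
        }'
}
```

You would then run something like make_urls 'Acinetobacter.*baumannii' < assembly_summary_refseq.txt > urls.txt followed by wget -i urls.txt. Pass genomic.gff.gz as the second argument to get the annotation files instead.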
Go to https://www.ncbi.nlm.nih.gov/genome/539 and click on the 'Assembly' link on the right hand side. It will take you to the Assembly page with the 8 linked assemblies. On this page, click on the blue 'Download Assemblies' button at the top and choose source and file type to download data.
Thank you very much for the workflow.