4.1 years ago
anran04100
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/assembly_summary.txt
I downloaded assembly_summary.txt.
awk -F '\t' '{if($12=="Complete Genome") print $20}' assembly_summary.txt > assembly_summary_complete_genomes_path.txt
Then I selected the rows marked Complete Genome and saved their FTP paths (column 20) to a new file.
However, the path file contains 21272 rows. I expected about 3000, since there are 3000+ bacteria in NCBI. What is wrong? How can I download all the bacterial genomes from NCBI?
Thanks!
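One thing worth checking is whether the 21272 rows are an error at all. assembly_summary.txt has one row per assembly, so several strains of the same species each get their own row. The sketch below counts complete-genome assemblies next to distinct species. It assumes column 7 is species_taxid and column 12 is assembly_level; the function name count_complete is made up for illustration.

```shell
# Hypothetical helper: compare the number of complete-genome assemblies
# with the number of distinct species they belong to.
# Assumption: column 7 = species_taxid, column 12 = assembly_level.
count_complete() {
    awk -F '\t' '
        /^#/ { next }                      # skip header/comment lines
        $12 == "Complete Genome" {
            n++                            # one row per assembly
            seen[$7] = 1                   # remember each species_taxid once
        }
        END {
            s = 0
            for (k in seen) s++
            printf "assemblies=%d species=%d\n", n, s
        }
    ' "$1"
}

# Example: count_complete assembly_summary.txt
```

If the species count comes out much lower than the assembly count, the extra rows are most likely additional strains rather than a mistake in the filter.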
http://blog.shenwei.me/manipulation-on-ncbi-refseq-bacterial-assembly-summary/
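For the download step, a minimal sketch: each saved path points to a directory, not a file, so the file name has to be appended. This assumes the usual NCBI layout in which the genomic FASTA is named after the last path component plus _genomic.fna.gz; the function name fasta_urls is made up for illustration.

```shell
# Hypothetical helper: turn each ftp_path line into the URL of its genomic FASTA.
# Assumption: the file lives at <ftp_path>/<last path component>_genomic.fna.gz.
fasta_urls() {
    # Splitting on "/" makes $NF the assembly directory name;
    # lines without a "/" (e.g. "na") are skipped.
    awk -F '/' 'NF > 1 { print $0 "/" $NF "_genomic.fna.gz" }' "$1"
}

# Example (try a few lines first, the full list is large):
# fasta_urls assembly_summary_complete_genomes_path.txt | head -n 3 | wget -i -
```

Piping the URLs into wget -i - lets wget read the list from standard input, so no intermediate file is needed.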