Download all the bacterial genomes of Pseudomonas aeruginosa from NCBI
5.5 years ago
Optimist ▴ 190

Dear members of Biostar!!

Greetings

I would like to download all the available genomes of the species Pseudomonas aeruginosa from NCBI.

Please let me know how to download all 4761 genomes for this species from NCBI (link).

Thanks

wgs ngs bacterial Genomics • 1.4k views
5.5 years ago
vkkodali_ncbi ★ 3.8k

Since all the other answers appear to be command-line based, here's a point-and-click method. Follow the link to the NCBI Genomes page you provided in your post (https://www.ncbi.nlm.nih.gov/genome/?term=pseudomonas%20aeruginosa) and click the 'Assembly' link in the 'Related Information' panel on the right-hand side. This takes you to the NCBI Assembly page (https://www.ncbi.nlm.nih.gov/assembly?LinkName=genome_assembly&from_uid=187), where you will find a blue 'Download Assemblies' button. If you like, use the filters on the left-hand side to narrow down the list first. Then, from the Download button, choose the source database (RefSeq or GenBank) and the file type you are interested in.

5.5 years ago
AK ★ 2.2k

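You can try it this way:

# Keep P. aeruginosa rows (column 8 = organism_name) that have a real path in column 20 (ftp_path),
# then append "<assembly_dir>_genomic.fna.gz" to each directory path (works for any URL scheme).
curl -s "ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/assembly_summary.txt" \
  | awk -v FS="\t" '$8~/Pseudomonas aeruginosa/ && $20!="na"{print $20}' \
  | sed -E 's|^(.+/)([^/]+)$|\1\2/\2_genomic.fna.gz|' \
  > asm_list.txt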

wget -i asm_list.txt

Where asm_list.txt contains the locations of those genomes:

ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/006/765/GCA_000006765.1_ASM676v1/GCA_000006765.1_ASM676v1_genomic.fna.gz
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/014/625/GCA_000014625.1_ASM1462v1/GCA_000014625.1_ASM1462v1_genomic.fna.gz
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/017/205/GCA_000017205.1_ASM1720v1/GCA_000017205.1_ASM1720v1_genomic.fna.gz
......
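
If it helps to see how each entry in asm_list.txt is put together, here is a minimal sketch of the same path rewriting as a small shell function. The function name genomic_fasta_url is made up for illustration; it only assumes that the file name is the last directory component followed by _genomic.fna.gz, as in the listing above.

```shell
# Hypothetical helper: turn an assembly directory URL (column 20 of
# assembly_summary.txt) into the URL of its genomic FASTA file.
genomic_fasta_url() {
  local dir="${1%/}"        # drop a trailing slash, if present
  local name="${dir##*/}"   # last path component, e.g. GCA_000006765.1_ASM676v1
  printf '%s/%s_genomic.fna.gz\n' "$dir" "$name"
}
```

Calling it on ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/006/765/GCA_000006765.1_ASM676v1 should print the first line of the listing above.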

To see the full metadata for these assemblies, save the output of:

curl -s "ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/assembly_summary.txt" | awk -v FS="\t" '$8~/Pseudomonas aeruginosa/{print}'
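
Once that output is saved to a file, a quick tally can show what you are about to download. The sketch below counts matching rows per assembly level, assuming column 12 is assembly_level (as described in NCBI's README for assembly_summary.txt). The function name count_by_assembly_level and the file name are placeholders of mine, not part of any NCBI tool.

```shell
# Hypothetical helper: count assemblies per assembly level (column 12)
# for rows whose organism name (column 8) contains the given species.
# Usage: count_by_assembly_level saved_summary.txt ["Species name"]
count_by_assembly_level() {
  local summary_file="$1"
  local organism="${2:-Pseudomonas aeruginosa}"
  awk -F'\t' -v org="$organism" '
    /^#/ { next }                          # skip the header/comment lines
    index($8, org) > 0 { n[$12]++ }        # tally by assembly level
    END { for (level in n) printf "%s\t%d\n", level, n[level] }
  ' "$summary_file" | sort
}
```

The output is one tab-separated line per level (for example, complete genomes versus contig-level assemblies), which can help decide whether to filter the list before running wget.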
5.5 years ago
Joe 21k

As genomax's comment suggested, you can use ncbi-genome-download by Kai Blin.

There are a few examples here you can also try: A: Easiest way to download all Enterobacteria
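
For this particular question, a call along the following lines might be a reasonable starting point. It is only a sketch: the wrapper name download_paeruginosa, the default output folder, and the choice of four parallel downloads are my own assumptions, and it expects ncbi-genome-download to be installed already (for example via pip or conda).

```shell
# Hypothetical wrapper around ncbi-genome-download.
# Usage: download_paeruginosa [output_dir] [extra ncbi-genome-download options]
download_paeruginosa() {
  local outdir="${1:-paeruginosa_genomes}"
  [ "$#" -gt 0 ] && shift
  # --section genbank mirrors the GenBank listing used in the other answer;
  # --genera does a simple string match on NCBI's organism name.
  ncbi-genome-download bacteria \
    --section genbank \
    --genera "Pseudomonas aeruginosa" \
    --formats fasta \
    --output-folder "$outdir" \
    --parallel 4 \
    "$@"
}
```

Running download_paeruginosa my_dir --dry-run first should list what would be fetched without downloading anything, which is worth doing before pulling several thousand genomes.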
