Creation of Fasta Database of Bacterial Proteins
7.7 years ago
rlee03 • 0

Hi,

I tried creating a FASTA database using an Entrez query, but I consistently get booted from the NCBI server because the request is so large. Can you recommend another, more reliable way to download all bacterial proteins (n = 5.8 million) as a FASTA file?

Thanks!

I'm a Python programmer and would prefer to do this from the command line.

blast blastp proteins fasta • 2.1k views

You can download many (all?) bacterial genomes here: ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria Getting protein sequences from bacterial genomes is most probably trivial for you ...


You could get all RefSeq proteins by downloading the faa protein files here.

Get the protein files from the folder hierarchy under the link posted by @Protostome, i.e. the *faa.gz files at: ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/name_0f_bacteria/latest_assembly_versions/GCA_*/*faa.gz
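Since you mentioned Python, here is a rough sketch of how the per-assembly files could be fetched and merged into one FASTA file. It is not tested against the live server. It assumes that each assembly directory holds a file named after the directory itself with a `_protein.faa.gz` suffix, and that every file ends with a newline. The function names (`protein_faa_url`, `download`, `merge_faa_gz`) are made up for illustration.

```python
import gzip
import shutil
import urllib.request


def protein_faa_url(assembly_dir_url):
    """Build the protein FASTA URL for one assembly directory.

    Assumption: the file prefix repeats the directory name, e.g.
    .../GCA_x_name/GCA_x_name_protein.faa.gz
    """
    base = assembly_dir_url.rstrip("/")
    dir_name = base.rsplit("/", 1)[-1]
    return f"{base}/{dir_name}_protein.faa.gz"


def download(url, dest_path):
    """Fetch one file; urllib handles ftp:// as well as https:// URLs."""
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


def merge_faa_gz(gz_paths, out_fasta):
    """Decompress each .faa.gz file and append it to a single FASTA file."""
    with open(out_fasta, "wb") as out:
        for path in gz_paths:
            with gzip.open(path, "rb") as src:
                # Stream in chunks so large files are never held in memory.
                shutil.copyfileobj(src, out)
    return out_fasta
```

You would loop over the assembly directories, call `download(protein_faa_url(d), local_name)` for each one, and then pass the local files to `merge_faa_gz`. Fetching one file at a time, with a short pause between requests, should be much gentler on the server than a single huge Entrez query.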
