Is there a quick way to extract all bacterial sequences from the NCBI non-redundant (nr) database using blastdbcmd?
The command below extracts only the human sequences (taxid 9606) from the nr database. How would I do the same for microbial sequences?
$ blastdbcmd -db nr -entry all -outfmt "%g %T" | \
awk ' { if ($2 == 9606) { print $1 } } ' | \
blastdbcmd -db nr -entry_batch - -out human_sequences.txt
BLAST+ 2.8.1 with version 5 BLAST databases (blastdb v5) lets you limit a search by taxonomy using information built into the databases themselves, so you no longer need to build a separate BLAST database for specific taxids.
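For the bacteria case, something along these lines might work. It is only a sketch: it assumes BLAST+ 2.8.1 or later, a v5-format nr database, and NCBI EDirect on your PATH (the `get_species_taxids.sh` script that ships with BLAST+ depends on it). `extract_taxon` and the file names are made up for illustration, and it is worth checking `blastdbcmd -help` to confirm your version has `-taxidlist`. Taxid 2 is Bacteria.

```shell
#!/usr/bin/env bash
# Sketch: pull every sequence under one taxon out of a v5 BLAST database.
# Assumes BLAST+ >= 2.8.1, a v5-format database, and NCBI EDirect installed.

# extract_taxon is a hypothetical helper name, not part of BLAST+.
# Arguments: <taxid> <database> <output FASTA file>
extract_taxon() {
    local taxid="$1" db="$2" out="$3"
    local list="taxon_${taxid}.txids"

    # Expand the taxon into the species-level taxids below it, one per line.
    get_species_taxids.sh -t "$taxid" > "$list" || return 1

    # Dump all sequences assigned to any of those taxids as FASTA.
    blastdbcmd -db "$db" -taxidlist "$list" -outfmt "%f" -out "$out"
}

# Usage (uncomment to run; nr is large, so expect this to take a while):
# extract_taxon 2 nr bacteria_sequences.fasta
```

The taxid list is written to a file rather than passed on the command line because a taxon as broad as Bacteria expands to a very large number of species-level taxids.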
Thank you very much for the replies.