Does anyone know how I can build a custom Prokka database, particularly for the Caudovirales?
I want to be able to run prokka --usegenus --genus Caudovirales.
Thanks
Hello, this is how I build a custom Prokka DB, e.g. for the genus Bacillus:
Download all available assemblies for the genus Bacillus from NCBI (GenBank, Genomic GenBank format). Uncompress the downloaded package, then run the following in the resulting assemblies folder:
$ gunzip *.gz
$ prokka-genbank_to_fasta_db *.gbff > Bacillus.faa
$ cdhit -i Bacillus.faa -o Bacillus -T 0 -M 0 -g 1 -s 0.8 -c 0.9
$ makeblastdb -dbtype prot -in Bacillus
$ mv Bacillus Bacillus.p?? <your-path-to>/prokka/db/genus/
$ prokka --setupdb
Hth, and Happy 2022!
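For anyone who wants to rerun the steps above for another group, here is a minimal sketch that wraps them in one shell function. The function name build_genus_db and its two arguments are made up for illustration and are not part of Prokka. It assumes the .gbff files are already uncompressed in the current directory, and it looks for the clustering binary under either name (cd-hit or cdhit), since the name differs between installs.

```shell
# Hypothetical helper (not shipped with Prokka): runs the steps from the
# answer above for a database called NAME, using the *.gbff files in the
# current directory.
build_genus_db() {
    local name="$1" genus_dir="$2" cdhit_bin

    # Refuse to run without a name or with a destination that does not exist.
    if [ -z "$name" ] || [ ! -d "$genus_dir" ]; then
        echo "usage: build_genus_db NAME /path/to/prokka/db/genus" >&2
        return 1
    fi

    # The binary is called cd-hit in some installs and cdhit in others.
    cdhit_bin=$(command -v cd-hit || command -v cdhit) || {
        echo "cd-hit not found on PATH" >&2
        return 1
    }

    # Extract proteins, and stop early if nothing came out.
    prokka-genbank_to_fasta_db ./*.gbff > "$name.faa" || return 1
    [ -s "$name.faa" ] || { echo "$name.faa is empty" >&2; return 1; }

    # Cluster, index, install, and re-index Prokka's databases
    # (same options as in the answer above).
    "$cdhit_bin" -i "$name.faa" -o "$name" -T 0 -M 0 -g 1 -s 0.8 -c 0.9 || return 1
    makeblastdb -dbtype prot -in "$name" || return 1
    mv "$name" "$name".p?? "$genus_dir"/ || return 1
    prokka --setupdb
}

# Example call (adjust the path to your own install):
#   build_genus_db Caudovirales <your-path-to>/prokka/db/genus
```

The empty-file check is there because an empty .faa would otherwise pass silently through the remaining steps.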
Thanks for the reply.
I have a number of .gbff.gz files, so I ran
prokka-genbank_to_fasta_db *.gbff.gz > sample.faa
or prokka-genbank_to_fasta_db GCF_* > sample.faa
But sample.faa is empty.
What is the best command line to get prokka-genbank_to_fasta_db *.gbff.gz > sample.faa to work?
Thanks
The command most likely doesn't work with gzipped files. You may need to do this first:
gunzip *.gbff.gz
That command will unpack the files and remove the .gz extension from them. After that, a slightly modified command (no .gz) should work:
prokka-genbank_to_fasta_db *.gbff > sample.faa
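If you would rather keep the original .gz files, a small sketch along these lines may help. Both function names (decompress_gbff and count_fasta_records) are invented for this example. The first writes an uncompressed copy next to each archive instead of replacing it, and the second counts FASTA headers so you can confirm that sample.faa is no longer empty.

```shell
# Hypothetical helpers, not part of Prokka.

# Decompress every *.gbff.gz in a directory, keeping the originals.
decompress_gbff() {
    local dir="${1:-.}" f
    for f in "$dir"/*.gbff.gz; do
        [ -e "$f" ] || continue        # no matches: the glob stays literal, skip it
        gzip -dc "$f" > "${f%.gz}"     # writes foo.gbff next to foo.gbff.gz
    done
}

# Count FASTA records (lines starting with '>') in a file.
count_fasta_records() {
    grep -c '^>' "$1" || true          # grep exits 1 on zero matches; still prints 0
}

# Example:
#   decompress_gbff .
#   prokka-genbank_to_fasta_db *.gbff > sample.faa
#   count_fasta_records sample.faa
```

A count of 0 after the conversion would suggest the input files still are not being read as GenBank records.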
I think instead of
it should be
This is also assuming that prokka is installed in one's home directory, which certainly does not hold universally. That means one may have to enter the actual location of the prokka directory instead of ~/prokka.
Yes, it is assuming that prokka is installed in the home directory, and the path should be adjusted to the actual prokka/db/genus path. I also agree with your second point. I'll edit my previous answer accordingly.
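If you are not sure where that directory is, here is a rough sketch for guessing it from the location of the prokka executable. It rests on two assumptions that you should check against your own install: that Prokka honours a PROKKA_DBDIR environment variable when it is set, and that it otherwise keeps its databases in a db folder one level above the directory holding the real (symlink-resolved) executable. The function name guess_genus_dir is made up, and readlink -f is the GNU form.

```shell
# Hypothetical helper: print the likely db/genus directory for a given
# prokka executable. The layout assumed here may not match every install.
guess_genus_dir() {
    local exe="$1" real

    # Assumption: an explicitly set PROKKA_DBDIR takes precedence.
    if [ -n "${PROKKA_DBDIR:-}" ]; then
        printf '%s/genus\n' "$PROKKA_DBDIR"
        return 0
    fi

    [ -e "$exe" ] || { echo "no such file: $exe" >&2; return 1; }

    # Resolve symlinks, then go from <root>/bin/prokka up to <root>/db/genus.
    real=$(readlink -f "$exe") || return 1
    printf '%s/db/genus\n' "$(dirname "$(dirname "$real")")"
}

# Example:
#   guess_genus_dir "$(command -v prokka)"
```

Check that the printed directory actually exists and already contains other genus databases before moving files into it.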
Thank you for the clarification.