I have over 900 assembled genomes that I want to annotate using Prokka. Previously I annotated a few sequences by running them one by one, but with 900 I need a Python script to automate the process.
Can anyone share a script that I could adapt for my annotation, please?
I second genomax's suggestion to do this with a shell script. Assuming your files have the .fna extension and that you are using the bash shell:
for i in *.fna
do
prokka --outdir "${i%.fna}" --force --prefix "${i%.fna}" --locustag "${i%.fna}" --rfam --cpus # "$i"
done
If you have C-shell:
foreach i ( *.fna )
prokka --outdir $i:r --force --prefix $i:r --locustag $i:r --rfam --cpus # $i
end
If all of the genomes belong to the same genus and/or kingdom, you could add the --genus and --kingdom switches as well. You may want to remove --rfam if you are not interested in non-coding RNAs, because that part of the search is the slowest. Also, replace the # after the --cpus switch with an actual number: left as is, bash treats # as the start of a comment, so the input file is never passed to prokka.
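To make those adjustments easier to try, here is a rough sketch of the same loop wrapped in a bash function. The function name annotate_all, its argument order, and the default of 8 CPUs are my own assumptions for illustration, not anything Prokka defines. Any extra switches (for example --rfam, --genus or --kingdom) are simply passed through to prokka, and an empty directory is skipped instead of running prokka on a literal "*.fna".

```bash
#!/usr/bin/env bash
# Sketch only: annotate every .fna file in a directory with Prokka.
# annotate_all, its arguments, and the default CPU count are hypothetical
# names/values chosen for this example.

annotate_all() {
    local dir="${1:-.}"     # directory holding the .fna assemblies
    local cpus="${2:-8}"    # number handed to --cpus (assumed default: 8)

    # Drop the first two arguments; whatever is left is forwarded to prokka.
    if [ "$#" -ge 2 ]; then shift 2; else shift "$#"; fi

    local fasta name
    for fasta in "$dir"/*.fna; do
        [ -e "$fasta" ] || continue         # glob matched nothing, skip
        name="$(basename "$fasta" .fna)"    # file name without the extension
        prokka --outdir "$dir/$name" --force --prefix "$name" \
               --locustag "$name" --cpus "$cpus" "$@" "$fasta"
    done
}
```

You would then call it as, say, annotate_all . 8 --kingdom Bacteria --rfam from the directory containing the assemblies. Output directories are created next to the input files, which differs slightly from the loop above only when you point it at another directory.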
Why do you think you need Python? Since this is a plain command-line call, you should be able to do this with a shell script.