I would like to know how to run prokka on a folder containing several .fasta files, one per genome, so that each genome gets annotated. Thank you for any ideas.
I'm assuming that you want to annotate each fasta file separately. If that's correct, then you should be able to do this relatively easily with GNU parallel.
According to the GitHub page, the simplest prokka usage is just:
prokka <input fasta file>
Therefore, if you want to run prokka on several input fasta files simultaneously, you could do this with GNU parallel. For example (assuming the fasta files are in your current working directory):
ls *.fasta | parallel --verbose "prokka {} --prefix {.}_out"
In the above command, each fasta file name is piped to parallel, which will launch a separate prokka analysis for each of those fasta files. The output file names will be based on the input fasta file names, with the ".fasta" extension removed and "_out" appended. The "--verbose" flag will print the prokka command for each input fasta file to the screen, which makes it easier to understand what exactly is going on.
Note that I have not tested the above command, so you might consider adding the "--dry-run" flag. This will print out the commands to be run without actually running them.
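If GNU parallel is not available, the same per-file idea can be sketched with a plain shell loop. The function below only prints one prokka command per fasta file (similar in spirit to "--dry-run") and does not run anything. The name build_prokka_cmds is made up for this example, and it assumes file names without spaces. Like the command above, it is untested against a real prokka installation.

```shell
# Hypothetical helper: print (do not run) one prokka command per .fasta file.
# Assumes file names contain no whitespace.
build_prokka_cmds() {
    dir="${1:-.}"                          # directory to scan, default: current
    for fasta in "$dir"/*.fasta; do
        [ -e "$fasta" ] || continue        # glob matched nothing: skip
        base="$(basename "$fasta" .fasta)" # file name without ".fasta"
        # Same shape as the parallel command: input file, then --prefix
        printf 'prokka %s --prefix %s_out\n' "$fasta" "$base"
    done
}
```

Running build_prokka_cmds . shows the commands for inspection. Once they look right, the output could be piped to sh to run them one after another, or to parallel to run them concurrently.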
You can find many great GNU parallel examples here.
Of course, if you didn't actually want to annotate these genomes separately, then the above approach will not be what you want.
A similar topic was discussed a couple of days ago - see here.
What have you tried? How does prokka take a FASTA file as an input argument? Can multiple files be provided? If one file is expected, can process substitution be used? You should ask and exhaust these questions yourself.
Now I'm intrigued about how you intend to use process substitution for this...
If only one file is expected with a parameter, say -f, you can use -f <(cat file1 file2 file3), and that is how process substitution can be of value here. The point of my comment was to get OP to think and invest some effort on how to solve their problem.
Ok, makes sense, but that would mix them, which may be reasonable or not, depending on what's in the files. Was just curious.
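To make the "mixing" point concrete, here is a small sketch that does not involve prokka at all. It counts fasta header lines in the single stream produced by <(cat ...), which shows that the receiving program sees one merged input. The function name count_merged_records is invented for this example. Because process substitution is not part of POSIX sh, the sketch calls bash explicitly.

```shell
# Hypothetical demo: count ">" header lines across all given fasta files,
# reading them through one process substitution, i.e. as one merged stream.
count_merged_records() {
    # "_" fills $0 inside bash -c; the file names arrive as "$@"
    bash -c 'grep -c "^>" <(cat "$@")' _ "$@"
}
```

Called with two files that hold one and two sequences, this reports three records in total, with nothing left to tell which file each record came from.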