Are there any working GNU parallel or similar Shell code examples on how to run Pilon, Prokka, VT and Snpeff tools on a batch of input files?
Can you be more specific?
What kind of files/what file structure?
The format for parallel is pretty much the same for whatever you want to do:
parallel 'mycommand {} {.}.ext' ::: inputfile(s)
The command is simply whatever you would normally use to invoke the program. Here's a real example of running a Python script:
parallel --gnu 'radar.py -a {} > {.}.radar' ::: *.faa
The {} simply means an input file, and {.} means the input filename with the extension stripped off, so that you can then append your own extension to identify your output files.
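To make the two placeholders concrete, here is a minimal sketch that previews the commands the radar.py example would generate. The helper names strip_ext and preview_radar are hypothetical, made up for illustration, and the plain-shell expansion only approximates {.} for ordinary file names (it does not handle a dot in a directory name the way parallel does).

```shell
# Plain-shell stand-in for parallel's {.}: drop the last ".ext" from a name.
# Hypothetical helper for illustration only; parallel does this itself.
strip_ext() {
  printf '%s\n' "${1%.*}"
}

# Print the command that would be built for each input, without running it.
# Mirrors: parallel 'radar.py -a {} > {.}.radar' ::: *.faa
preview_radar() {
  for f in "$@"; do
    printf 'radar.py -a %s > %s.radar\n' "$f" "$(strip_ext "$f")"
  done
}

# Example:
#   preview_radar sample1.faa sample2.faa
# With GNU parallel installed, --dry-run gives a similar preview:
#   parallel --dry-run 'radar.py -a {} > {.}.radar' ::: *.faa
```

Previewing with --dry-run before launching a real batch is a cheap way to check that the output names come out as intended.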
Alternatively, you can pipe the file names from ls, e.g.:
ls *.gz | parallel --gnu gunzip &
Just be careful to manage the threads you launch, as most of those tools also support a threading option. If you invoke as many parallel jobs as you have cores, and each process you invoke then tries to multithread as well, you can quickly cause CPU thrashing, and the jobs won't end up running any faster.
Hi england_bioinformatics_team,
The GNU parallel manual is quite extensive; have you had a look at it? See also this post: Gnu Parallel - Parallelize Serial Command Line Programs Without Changing Them
Cheers,
Wouter
While GNU parallel is very powerful for parallelizing commands, what you might be looking for is something even more powerful like Snakemake, which is designed to run workflows.