I have 10,000 genomes. Analyzing each genome with the following software takes 2-3 minutes. I am using the following loop, and I think it will take about a month to analyze my data. I am looking for a faster way, e.g. using parallel.
How can I fit the loop into parallel? Or are there any other suggestions?
cat fna.ls | while read -r i j; do
    mkdir -p ~/jobs_resfinder/"${j%.*}"
    perl ~/res/resfinder.pl -d ~/res/resfinderdb -i "${i}" -a all -k 90.00 -l 0.60 -o ~/jobs_resfinder/"${j%.*}"
done
I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.
In addition, I converted this thread to a "Question". "Tool" should only be used for announcing new tools.
Thanks. I have no coding background and struggle a lot with it. I googled a lot but couldn't solve this problem, so I am looking for an expert solution!
Because of parallel -n 2, restFinderFunction gets two args. Inside the function they are $1 and $2. You don't need to reassign them to i and j; you can use them directly as well. As for running the script, simply save it, chmod +x it, and execute it: ./script.sh. Don't call it with parallel.
You can monitor things with e.g. htop. If I/O is the bottleneck, then running in parallel will do you little good.
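For readers following along, here is a minimal sketch of what such a script could look like. It is an assumption about the setup, not the exact script from this thread: the function name run_resfinder and the variables OUT_ROOT and JOBS are hypothetical, and fna.ls is assumed to hold two whitespace-separated fields per line (the input path and the file name) with no spaces inside either field. The resfinder.pl options are copied from the loop in the question.

```shell
#!/usr/bin/env bash
# Sketch only. OUT_ROOT, JOBS and run_resfinder are hypothetical names.
OUT_ROOT="${OUT_ROOT:-$HOME/jobs_resfinder}"   # where per-genome output directories go
JOBS="${JOBS:-4}"                              # number of genomes processed at once

run_resfinder() {
    # $1 = path to the genome FASTA, $2 = file name used to name the output directory
    local outdir="$OUT_ROOT/${2%.*}"
    mkdir -p "$outdir"
    perl ~/res/resfinder.pl -d ~/res/resfinderdb -i "$1" -a all -k 90.00 -l 0.60 -o "$outdir"
}

# GNU parallel starts new shells, so the function and variable must be exported.
export -f run_resfinder
export OUT_ROOT

# Run only when both the list and GNU parallel are present.
if [ -f fna.ls ] && command -v parallel >/dev/null 2>&1; then
    # parallel treats each input line as one argument, so tr first puts every
    # field on its own line; -n 2 then passes two arguments to each job.
    tr -s ' \t' '\n\n' < fna.ls | parallel -j "$JOBS" -n 2 run_resfinder
fi
```

Save it as script.sh, run chmod +x script.sh, and start it with ./script.sh from the directory that contains fna.ls. If any path in fna.ls contains a space, the two-field assumption breaks and the arguments will be split in the wrong place.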
I tried your script. It creates a directory, but the directory is empty. It also produces another directory named " Network", and I can't figure out why. The main problem is that it doesn't execute the Perl script, so there is no output in the directory.
Paste the output of
cat fna.ls
There are ~10,000 lines. I pasted only 2.
Please reformat your post with code markup, as described in the moderator's note above.