Hi, We have all the reference bacterial genomes on our server, sorted into subfolders A, B, C, D ... Z. Each of these contains one subfolder per bacterial genome, and the FASTA file sits inside that subfolder. For example:

    ~/bacterial_genome/P/Pseudomonas_genome1/genome1.fna

Now I want to know the resistance status of each genome by running resfinder. I concatenated the whole database into a single input file, but because of its size (~430 GB) our server killed the job. So I downloaded GNU parallel to handle it. I need to run resfinder 15 times for each genome (because the -a parameter changes each time to a different resistance phenotype).
Generally, the commands for one genome are the following:

    cd ~/resistance
    mkdir aminoglycoside
    cd aminoglycoside
    export PATH=$PATH:/usr/local/bin/blast-2.2.26/bin
    resfinder.pl -d /Volumes/scratch/databases/resfinderdb/ -i ~/bacterial_genome/P/Pseudomonas_genome1/genome1.fna -a aminoglycoside -k 90.00 -l 0.60
The main problem is that this software creates its results file in the current directory, and if the program is run again in the same directory, it deletes the previous results and saves the new ones.
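One way around the overwriting is to give every genome/phenotype pair its own results directory. The sketch below is only an illustration: `outdir_for`, the `results` root and the second phenotype name are hypothetical, and it assumes your resfinder.pl accepts the `-o` output option that appears in the follow-up command (if not, `cd` into the directory first, as in the commands above). It only prints the commands and does not run resfinder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a unique results directory from a genome
# path and a phenotype name, e.g.
#   .../P/Pseudomonas_genome1/genome1.fna + aminoglycoside
#   -> results/Pseudomonas_genome1/aminoglycoside
outdir_for() {
    local fasta=$1 phenotype=$2 root=${3:-results}
    local genome_dir
    # The genome subfolder name is the parent directory of the FASTA file.
    genome_dir=$(basename "$(dirname "$fasta")")
    printf '%s/%s/%s\n' "$root" "$genome_dir" "$phenotype"
}

# Dry run: print (do not execute) one command per phenotype.
# "other_phenotype" is a placeholder, not a real resfinder -a value.
fasta=bacterial_genome/P/Pseudomonas_genome1/genome1.fna
for phenotype in aminoglycoside other_phenotype; do
    out=$(outdir_for "$fasta" "$phenotype")
    echo "mkdir -p $out && resfinder.pl -i $fasta -a $phenotype -o $out"
done
```

Because each run writes into a different directory, repeated runs for the same genome no longer replace each other's results.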
I am looking for a way to write a script that runs resfinder in parallel for each genome.
Thanks
Hi, I rewrote my command to use a while loop. Could you please send me a GNU parallel solution for it?

    cat ./fna.ls | while read -r i j; do
        mkdir -p "./${j%.}"
        perl ~/res/resfinder.pl -d ~/res/resfinderdb -i "${i}" -a all -k 90.00 -l 0.60 -o "./${j%.}"
    done
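A possible GNU parallel version of that loop is sketched below. It assumes fna.ls holds two whitespace-separated columns (FASTA path, output directory) and that the second column is already in the form you want, so the `${j%.}` trimming from the loop is not reproduced. `run_job`, `run_all` and `RESFINDER_DRY_RUN` are hypothetical names made up for this sketch. By default it only prints the resfinder commands, and it falls back to a plain serial loop when GNU parallel is not found.

```shell
#!/usr/bin/env bash
# Sketch only: run_job, run_all and RESFINDER_DRY_RUN are hypothetical names.

# One job: create the output directory, then run (or just print) resfinder.
run_job() {
    local fasta=$1 outdir=$2
    mkdir -p "$outdir" || return 1
    local cmd=(perl "$HOME/res/resfinder.pl" -d "$HOME/res/resfinderdb"
               -i "$fasta" -a all -k 90.00 -l 0.60 -o "$outdir")
    if [ "${RESFINDER_DRY_RUN:-1}" = 1 ]; then
        echo "${cmd[*]}"    # dry run: show the command only
    else
        "${cmd[@]}"         # real run
    fi
}
export -f run_job           # make the function visible to parallel's jobs

# Run every "fasta outdir" line of a list file, at most $2 jobs at a time.
run_all() {
    local list=$1 jobs=${2:-4}
    if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
        # --colsep splits each input line into columns {1} and {2}.
        parallel -j "$jobs" --colsep '[ \t]+' run_job {1} {2} :::: "$list"
    else
        # Fallback when GNU parallel is not installed: serial loop.
        while read -r fasta outdir; do
            run_job "$fasta" "$outdir"
        done < "$list"
    fi
}

# Usage (dry run by default):
#   run_all ./fna.ls 8
# For real runs:
#   export RESFINDER_DRY_RUN=0; run_all ./fna.ls 8
```

Keeping the per-genome work in one function means the same code runs under parallel and under the serial fallback, and the job count can be tuned to what the server tolerates.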