Speed up PHASE software for haplotype inference
8.9 years ago
kshitijtayal ▴ 40

I am a bioinformatician and am stuck on a problem that needs some scripting to speed up my workflow. We use a program called PHASE, and the command I type to run it is

./PHASE test.inp test.out

where PHASE is the name of the program, test.inp is the input file, and test.out is the output file. The run uses one core and takes approximately 3 hours to complete.

Now I have 1000 input files, test1.inp, test2.inp, test3.inp, ... up to test1000.inp, and want to generate all 1000 output files, test1.out, test2.out, ..., test1000.out, using the full capacity of my system, which has 4 cores.

To use the full capacity of my system, I want to launch 4 instances of the above command, each taking a different input file and generating a different output, like this:

./PHASE test1.inp test1.out
./PHASE test2.inp test2.out
./PHASE test3.inp test3.out
./PHASE test4.inp test4.out

After each job has finished and its output file has been generated, the script should launch jobs for the remaining input files until all are done:

./PHASE test5.inp test5.out
./PHASE test6.inp test6.out
./PHASE test7.inp test7.out
./PHASE test8.inp test8.out

and so on.

How do I write a script for the above process that takes advantage of all 4 cores and speeds up my processing?

multithreading unix haplotype • 1.9k views
8 months ago
WANG • 0

The GNU tool parallel might be an option.
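As a rough illustration of the idea, here is a minimal sketch that keeps up to 4 PHASE jobs running at once and starts the next input as soon as a slot frees up. It uses `xargs -P` rather than GNU parallel, since xargs is usually already installed. The function name `run_phase_batch` is hypothetical, and the sketch assumes the `.inp`/`.out` naming from the question and an xargs and find that support `-P`, `-0`, `-print0`, and `-maxdepth` (the GNU and BSD versions do).

```shell
#!/bin/sh
# Sketch only: run a program on every *.inp file in a directory,
# with at most N jobs running at the same time.
# run_phase_batch is a hypothetical helper name, not part of PHASE.
run_phase_batch() {
    phase_bin=$1   # path to the PHASE executable, e.g. ./PHASE
    jobs=$2        # number of simultaneous jobs, e.g. 4 for 4 cores
    dir=$3         # directory that holds the .inp files

    # find emits one NUL-terminated input path per file.
    # xargs -P keeps up to $jobs processes alive and launches the next
    # input as soon as any running job exits.
    # Inside sh -c: $0 is the program, $1 is the input file, and
    # ${1%.inp}.out swaps the .inp suffix for .out.
    find "$dir" -maxdepth 1 -name '*.inp' -print0 |
        xargs -0 -P "$jobs" -n 1 sh -c '"$0" "$1" "${1%.inp}.out"' "$phase_bin"
}
```

From the directory containing PHASE and the input files, this would be called as `run_phase_batch ./PHASE 4 .`. If GNU parallel is installed, something like `parallel -j 4 ./PHASE {} {.}.out ::: test*.inp` should do the same job, where `{.}` is the input name with its extension removed.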
