Question

Iterate PartitionFinderProtein for multiple files containing protein alignments

10.5 years ago

dago ★ 2.8k

I have multiple files, each containing a protein alignment.

I would like to run PartitionFinderProtein on all of them, but I need to automate it, because I cannot launch the program by hand on more than 5000 files.

Is that possible at all?

The first problem is that it needs a .cfg file that defines the data_blocks to be considered in the sequence.

Is anyone aware of a way to always use the same .cfg file, telling the program to always consider the full length of the sequence?

Moreover, it saves the output in a folder that the program calls analysis. Is there a way to choose the output directory, or at least its name?

Thanks for the help.

iteration phylogeny protein • 1.7k views

score 0 · Answer 1 · 2015-01-23

The tool looks like any other command-line application. Assuming your file locations follow a predictable pattern, you should have no problems automating it.

Look into UNIX loops and figure out which inputs change per run and which ones remain constant. Then make the ones that change vary in a predictable fashion (for example, put them all in the same folder or collect their locations in a single text file). Once you have worked this out, you can write a loop that keeps the constant inputs fixed and substitutes in the variable ones for each run.
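As a rough sketch of such a loop, written in Python rather than shell: the interpreter and the script path stay constant, and only the folder changes per run. The function name `run_folders`, the script path and the `python` interpreter name are assumptions for illustration; adjust them to your installation. With `dry_run=True` it only builds the commands, so you can inspect them before launching anything.

```python
import subprocess
from pathlib import Path


def run_folders(parent, script, python="python", dry_run=True):
    """Build (and optionally run) one command per subfolder of `parent`.

    `script` is the path to the PartitionFinderProtein script and `python`
    the interpreter to launch it with; both are assumptions to adapt.
    """
    commands = []
    for folder in sorted(Path(parent).iterdir()):
        if not folder.is_dir():
            continue  # skip stray files sitting next to the run folders
        # Constant parts: interpreter and script. Variable part: the folder.
        cmd = [python, str(script), str(folder)]
        commands.append(cmd)
        if not dry_run:
            # check=True stops the loop at the first failing run
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_folders("ARM-PARENT", "PartitionFinderProtein.py")` returns the list of commands without executing them; pass `dry_run=False` once they look right.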

We can help you out, but this looks like it's gonna take some time to get the details figured out.

EDIT-1: Have multiple "Armadillo"-type folders, one per run. That way, an analysis folder will be created inside each of them. "Armadillo-type" here means the folder your data files reside in, just like the "Armadillo" folder from the example.

EDIT-2: Let's say you have 5 folders, ARM-1, ARM-2, ARM-3, ARM-4 and ARM-5, each with an input file of 4000 sequences. Make sure all of these folders sit in the same parent folder (let's call it ARM-PARENT). In a terminal, navigate to ARM-PARENT and loop over each ARM-X folder inside it, changing the second argument on the command line (the folder passed to the Python script) on each iteration. Also, you might wanna paste the same config file into each ARM-X folder so the config remains constant across runs.
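Setting up the per-run folders could be scripted along these lines. This is only a sketch under a few assumptions: the alignments are PHYLIP files whose first line is "ntax nchar", the config file is named partition_finder.cfg as in the bundled example, and the shared config is kept as a template with two hypothetical placeholders, `{alignment}` and `{length}`, that get filled in per file. The helper names are made up for illustration. Filling in `{length}` is one way to make the single data block span the full sequence.

```python
import shutil
from pathlib import Path


def phylip_length(alignment):
    """Return the number of columns from a PHYLIP header line ("ntax nchar")."""
    with open(alignment) as handle:
        header = handle.readline().split()
    return int(header[1])


def prepare_run(alignment, parent, template, cfg_name="partition_finder.cfg"):
    """Create one run folder for `alignment` and write its config file.

    `template` is the shared config text; `{alignment}` and `{length}` are
    hypothetical placeholders replaced with the file name and its length.
    """
    alignment = Path(alignment)
    folder = Path(parent) / alignment.stem  # e.g. ARM-PARENT/gene1
    folder.mkdir(parents=True, exist_ok=True)
    shutil.copy(alignment, folder / alignment.name)
    config = template.format(alignment=alignment.name,
                             length=phylip_length(alignment))
    (folder / cfg_name).write_text(config)
    return folder
```

Each alignment then gets its own folder, so every run writes its analysis folder in a separate place and nothing is overwritten.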