Hi everyone. I’m looking at describing a workflow for (initially) checking the quality of genome sequences. I have written a series of Python scripts that do things like deciding which parts of the workflow to run, calling command line tools, passing files to the next stages, etc., all of which would appear to be easier to do in CWL. However, I’m having trouble figuring out how to do the non-basic stuff. The User Guide gives nice examples of individual stages and a simple workflow, but it’s hard to figure out all of the options that are possible.
For example, I think the scatter option will allow me to provide a series of input files and perform the same workflow steps on each file. Later on, I want to check ALL of the outputs to make sure they pass sanity checks before moving on to later stages (not yet defined). Do you have a workflow example that describes this? It’s not clear to me exactly how it works. I also have a step that might need repeating, depending on the output of later stages. In my example, FastQC is run on a file, and the output is checked for warn/fail flags against the defined tolerances. Some files may need to go through a quality trim stage and then go back through the FastQC and checking stages, with the second run-through deciding whether a borderline original file is now okay to pass through. Is that kind of automated decision making possible in CWL?
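To make the "check ALL of the outputs" part concrete, here is a minimal Python sketch of the per-file check followed by a gate over every file. It assumes the input is the text of FastQC's tab-separated summary.txt (status, module name, file name). The tolerance values and the function names (parse_summary, passes, all_pass) are hypothetical and only there for illustration.

```python
from typing import Dict, Iterable

# Hypothetical tolerances: how many WARN/FAIL modules are acceptable per file.
MAX_WARN = 2
MAX_FAIL = 0


def parse_summary(text: str) -> Dict[str, str]:
    """Map FastQC module name -> status (PASS/WARN/FAIL) from summary.txt text."""
    statuses = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) >= 2:
            statuses[parts[1]] = parts[0]
    return statuses


def passes(statuses: Dict[str, str],
           max_warn: int = MAX_WARN,
           max_fail: int = MAX_FAIL) -> bool:
    """Per-file sanity check against the (assumed) tolerances."""
    values = list(statuses.values())
    return values.count("WARN") <= max_warn and values.count("FAIL") <= max_fail


def all_pass(summaries: Iterable[str]) -> bool:
    """Apply the same check to each file, then require that every file passes."""
    return all(passes(parse_summary(text)) for text in summaries)
```

The per-file call corresponds to the scattered step, and all_pass corresponds to a single later step that takes the whole array of results as its input.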
Hi Michael,
Thanks for the reply. In the case that tolerances fail, your solution would be fine. However, there are also "maybe" tolerances, where a trimming stage is applied and the file then needs to go back through the previous stages. I shall have to think about whether moving to CWL is worthwhile for the use case I currently have. In theory it should make it easier for users to alter the workflow to their needs, but the amount of work needed to get there might be too much, given this is simply a pilot use case. I'm currently using cwltool to run the workflows, and am just trying to figure out how to define the workflows in CWL. As things currently stand, I have non-CWL, Python-only scripts that work as expected, and the workflow may be too complex to move over in the limited project time left. I will get back to you if I figure it out.
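For anyone following along, the "maybe" case can be sketched in plain Python as a bounded check, trim, re-check loop. This is only an illustration of the control flow being asked about, not the actual scripts: run_qc, classify and trim are hypothetical callables supplied by the caller, and the verdict strings "pass", "maybe" and "fail" are assumed names.

```python
from typing import Callable, Tuple


def qc_with_retrim(path: str,
                   run_qc: Callable[[str], object],
                   classify: Callable[[object], str],
                   trim: Callable[[str], str],
                   max_trims: int = 1) -> Tuple[str, str]:
    """Run QC, trimming and re-checking borderline files up to max_trims times.

    Returns (final_path, verdict), where verdict is "pass" or "fail".
    """
    current = path
    for trims_done in range(max_trims + 1):
        verdict = classify(run_qc(current))
        if verdict in ("pass", "fail"):
            return current, verdict
        # Borderline ("maybe"): trim and go round again if any trims remain.
        if trims_done < max_trims:
            current = trim(current)
    # Still borderline after the allowed number of trims: treat as a failure.
    return current, "fail"
```

Because the number of trims is capped, the same logic could also be written out as a fixed chain of steps (check, trim, check) rather than as an open-ended loop, which may be easier to express in a workflow language.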
Cheers
Darren