I've been using this method to rename files at the end of a step:
    processed_fastq_1:
      type: File
      outputBinding:
        glob: ${ return '**/' + inputs.fastq1.basename }
        outputEval: |
          ${
            self[0].basename = inputs.add_rg_SM + '_R1.fastq.gz';
            return self[0]
          }
But I'm not sure if this is an acceptable method. Should there instead be an intermediate step, using Python or calling a script, to rename files in the middle of a CWL workflow?
I think either method would be acceptable, and each comes with its own benefits and pitfalls. For example, one benefit of an intermediate renaming step is that it would be compatible with a greater number of execution engines, such as Cromwell, cwlexec, snakemake, etc. By creating an intermediate step you might avoid some of the issues that a particular engine would impose. I am speaking only from experience with cwlexec, but I know that the expression in the question would not be interpreted by cwlexec for job submissions to a cluster.
The obvious benefit of the outputEval approach is that, if you know it works on your engine, you save an unnecessary intermediate step.
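As a rough illustration of the intermediate renaming step discussed above, here is a minimal sketch of a separate CommandLineTool that copies a file to a new name. The input and output names (infile, new_name, renamed) are made up for this example, and it assumes cp is available in the execution environment. It uses only a parameter reference, not a JavaScript expression, so it should not need InlineJavascriptRequirement. I have not tested it on every engine.

```yaml
cwlVersion: v1.0
class: CommandLineTool
# Copy the input file into the step's output directory under a new name.
baseCommand: cp
inputs:
  infile:            # hypothetical name: the file to rename
    type: File
    inputBinding:
      position: 1
  new_name:          # hypothetical name: the desired file name
    type: string
    inputBinding:
      position: 2
outputs:
  renamed:
    type: File
    outputBinding:
      # Plain parameter reference; no JavaScript required.
      glob: $(inputs.new_name)
```

In a workflow, new_name could be supplied from a workflow-level string input (for example the sample name plus a suffix). The trade-off is an extra copy of the file on disk.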
cwlexec does support InlineJavascriptRequirements, so there is no compatibility issue with the asker's proposal. If you've experienced a problem with cwlexec then that should be reported to https://github.com/IBMSpectrumComputing/cwlexec/issues?q=is%3Aopen+is%3Aissue and to your IBM support contact.
I have been working with IBM quite a bit on bug fixes in regards to what they support and don't support. I work with CWLexec every day in our lab to develop pipeline automation. In theory, they support InlineJavascriptRequirement and ShellCommandRequirement, but in practice CWLexec is quite picky. If you do not use cwlexec, this convo doesn't really matter :).
Is there any way to do this after the fact in an ExpressionTool?
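One possible way to do that, as an untested sketch: an ExpressionTool that takes the file and the new name, sets the basename on the File object, and returns it. The names srcfile, newname and outfile are placeholders for this example. Whether the new basename is honored when the file is staged for later steps may depend on the engine.

```yaml
cwlVersion: v1.0
class: ExpressionTool
requirements:
  InlineJavascriptRequirement: {}
inputs:
  srcfile: File      # placeholder name: the file produced by an earlier step
  newname: string    # placeholder name: the desired file name
outputs:
  outfile: File
expression: |
  ${
    // Change only the name recorded on the File object; no data is copied.
    inputs.srcfile.basename = inputs.newname;
    return {"outfile": inputs.srcfile};
  }
```

This would run as its own workflow step after the tool that produces the file, so the tool itself would not need the outputEval block.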