Hi there,
Is --preserve_environment ENVVAR meant to work with Docker containers?
If I print os.environ from within a container launched as a result of a DockerRequirement, I find that my environment variable is not being preserved.
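For reference, the check amounts to something like the minimal sketch below. ENVVAR is only a placeholder for whatever name is passed to the flag, and describe_env_var is a helper name made up for this illustration.

```python
import os


def describe_env_var(name, environ=None):
    """Report whether `name` is visible in the given environment mapping."""
    # Fall back to the real process environment when no mapping is supplied.
    environ = os.environ if environ is None else environ
    if name in environ:
        return f"{name}={environ[name]}"
    return f"{name} is not set"


if __name__ == "__main__":
    # ENVVAR is a placeholder name, not a variable that cwltool defines.
    print(describe_env_var("ENVVAR"))
```

Run on the host and again inside the container, this makes it easy to see on which side the variable goes missing.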
Bug or working as intended?
Note: I left the same comment over at https://github.com/common-workflow-language/cwltool/issues/196
Thanks!
One of the goals of using a workflow language like the CWL standards is to be explicit and portable -- and software containers like Docker help a great deal with that.
The reference runner's --preserve-environment flag was added to aid developers and is useful when running in places where software container technology like Docker isn't available and site-specific settings (like PATH and CLASSPATH) need to be set. For example, see the scripts I wrote for running the EBI Metagenomics pipeline without Docker on their cluster: https://github.com/ProteinsWebTeam/ebi-metagenomics-cwl/blob/master/workflows/standalone.sh#L34 and https://github.com/ProteinsWebTeam/ebi-metagenomics-cwl/blob/master/workflows/ebi-setup.sh#L27
So if you are using Docker, you don't need this local configuration hack -- your containers or tool should be explicit about setting the needed value(s), especially as --preserve-environment isn't part of the standard interface for cwl-runners (it is specific to the CWL reference runner, cwltool).
Does that make sense?
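As a rough illustration of being explicit, the CWL standard's EnvVarRequirement lets a tool description declare the variables it needs. The sketch below builds such a description in Python and prints it as JSON, which CWL runners accept alongside YAML. The helper name tool_with_env, the variable MY_SETTING and its value are assumptions made up for this example, not something taken from the pipeline above.

```python
import json


def tool_with_env(base_command, env):
    """Build a minimal CWL v1.0 CommandLineTool that sets `env` explicitly."""
    return {
        "cwlVersion": "v1.0",
        "class": "CommandLineTool",
        "baseCommand": base_command,
        "requirements": [
            # envDef is given here in its map form: variable name -> value.
            {"class": "EnvVarRequirement", "envDef": dict(env)},
        ],
        "inputs": [],
        "outputs": [],
    }


if __name__ == "__main__":
    # MY_SETTING is a hypothetical variable used only for illustration.
    tool = tool_with_env("env", {"MY_SETTING": "some-value"})
    print(json.dumps(tool, indent=2))
```

Because the value lives in the tool description itself, it travels with the workflow instead of depending on how a particular runner was invoked.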
It does make sense and I wholeheartedly agree on the points of being explicit and portable.
However, my use case was actually to track the Docker containers spawned by a specific workflow, perhaps for monitoring or similar purposes. I was hoping to inject an environment variable so that I could distinguish those containers from others that might be alive on the same machine, e.g. in a shared-server environment. This would all be happening on a layer above the workflow.
Hence, I think my use case actually falls outside the philosophical constraints of CWL. For me, at least, it would be a really nice little feature and would not be part of a broader pattern of misuse of the framework! However, I understand that this could be a gateway to hackish behaviour, and so you might not want to include it.
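To make the use case concrete: if such a variable did reach the containers, the monitoring layer could pick them out of `docker inspect` output, which reports each container's environment as KEY=VALUE strings under Config.Env. This is only a sketch under that assumption; the variable name WORKFLOW_ID, the helper containers_with_env and the sample data are all made up for illustration.

```python
import json


def containers_with_env(inspect_json, name, value=None):
    """Return IDs of containers whose Config.Env sets `name` (to `value`, if given)."""
    matches = []
    for container in json.loads(inspect_json):
        # Config.Env is a list of "KEY=VALUE" strings and may be null.
        entries = container.get("Config", {}).get("Env") or []
        env = dict(
            entry.split("=", 1) if "=" in entry else (entry, "")
            for entry in entries
        )
        if name in env and (value is None or env[name] == value):
            matches.append(container["Id"])
    return matches


if __name__ == "__main__":
    # Made-up sample in the shape of `docker inspect` output.
    sample = json.dumps([
        {"Id": "aaa111", "Config": {"Env": ["PATH=/usr/bin", "WORKFLOW_ID=run-42"]}},
        {"Id": "bbb222", "Config": {"Env": ["PATH=/usr/bin"]}},
    ])
    print(containers_with_env(sample, "WORKFLOW_ID"))
```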
Thanks again for your feedback!
What you seek (site-specific configuration that doesn't affect the analysis itself) is reasonable, and is a feature that one might add to a real workflow management system, which the reference CWL runner is not :-)
Perhaps you could add this feature to Toil or another system with CWL support? http://www.commonwl.org/#Implementations