--preserve-environment with Docker containers?
7.2 years ago
Biowoogles ▴ 20

Hi there,

Is --preserve-environment ENVVAR meant to work with Docker containers?

If I print os.environ from within a container launched as a result of a DockerRequirement, I find that my environment variable is not being preserved.

Bug or working as intended?

Note: I left the same comment over at https://github.com/common-workflow-language/cwltool/issues/196

Thanks!

cwltool Common-Workflow-Language cwl • 1.8k views
7.2 years ago

Hello Biowoogles,

For (Docker) containers you should use other methods:

1) Either set the environment variable as part of the container creation (via the Dockerfile or some other method)

- or -

2) Set the environment variable at the CommandLineTool, workflow step, or Workflow level using EnvVarRequirement in the requirements or hints section as appropriate: http://www.commonwl.org/user_guide/12-env/
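As a rough illustration of option 2, here is a minimal Python sketch that builds a CommandLineTool description carrying an EnvVarRequirement and prints it as JSON. Since JSON is also valid YAML, the output could be saved as a .cwl file. The function name `make_env_tool`, the variable `WORKFLOW_TAG`, its value, and the `debian:stable-slim` image are all made up for the example; only the CWL field names come from the specification.

```python
import json


def make_env_tool(var_name: str, var_value: str, image: str) -> dict:
    """Build a minimal CWL CommandLineTool description as a plain dict.

    The tool runs `env` inside the given Docker image, so the captured
    stdout should list the variable set through EnvVarRequirement.
    """
    return {
        "cwlVersion": "v1.0",
        "class": "CommandLineTool",
        "baseCommand": "env",
        "requirements": {
            # Run the tool inside a container (image name is an assumption).
            "DockerRequirement": {"dockerPull": image},
            # Declare the variable explicitly instead of inheriting it
            # from the environment of the machine that launches the run.
            "EnvVarRequirement": {"envDef": {var_name: var_value}},
        },
        "inputs": {},
        "outputs": {"printed_env": {"type": "stdout"}},
        "stdout": "env.txt",
    }


# Hypothetical variable name and value, chosen only for illustration.
tool = make_env_tool("WORKFLOW_TAG", "run-42", "debian:stable-slim")
print(json.dumps(tool, indent=2))
```

Because the variable is declared in the tool description itself, any CWL runner that supports EnvVarRequirement should set it, which is what keeps the workflow portable.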

7.2 years ago
Biowoogles ▴ 20

Hi Michael,

Thank you very much for your answer.

Is there a specific reason why this is the case? Would you potentially accept a PR to enable this?

Kind regards!


One of the goals of using a workflow language like the CWL standards is to be explicit and portable -- and software containers like Docker help a great deal with that.

The reference runner's --preserve-environment flag was added to aid developers. It is useful when running in places where software container technology like Docker isn't available and site-specific settings (like PATH and CLASSPATH) need to be set.
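To make the idea concrete, here is a small Python sketch of what "preserving" named variables amounts to conceptually: the child process gets an otherwise clean environment plus only the variables that were asked for. This is an illustration of the concept, not cwltool's actual implementation, and the helper `build_child_env` is a made-up name.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def build_child_env(
    preserve: Iterable[str],
    parent: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a fresh environment holding only the requested variables.

    Names that are not set in the parent environment are skipped
    rather than being added with an empty value.
    """
    source = os.environ if parent is None else parent
    return {name: source[name] for name in preserve if name in source}


# Example with a made-up parent environment (values are illustrative).
site_env = {"PATH": "/opt/tools/bin:/usr/bin", "CLASSPATH": "/opt/jars", "HOME": "/home/me"}
child_env = build_child_env(["PATH", "CLASSPATH", "NOT_SET"], parent=site_env)
print(child_env)
```

A dict like `child_env` could then be passed as the `env` argument of `subprocess.run` when launching a tool directly on the host.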

For example, see the scripts I wrote for running the EBI Metagenomics pipeline without Docker on their cluster: https://github.com/ProteinsWebTeam/ebi-metagenomics-cwl/blob/master/workflows/standalone.sh#L34 & https://github.com/ProteinsWebTeam/ebi-metagenomics-cwl/blob/master/workflows/ebi-setup.sh#L27

So if you are using Docker, you don't need this local configuration hack: your containers or tools should be explicit about setting the needed value(s). This matters all the more because --preserve-environment isn't part of the standard interface for cwl-runners (it is specific to the CWL reference runner, cwltool).

Does that make sense?


It does make sense and I wholeheartedly agree on the points of being explicit and portable.

However, my use case was actually to track the Docker containers spawned by a specific workflow, perhaps for monitoring. I was hoping to inject an environment variable so that I could distinguish those containers from others that might be alive on the same machine, e.g. in a shared server environment. This would all be happening on a layer above the workflow.

Hence, I think my use case is actually outside the philosophical constraints of CWL. For me, at least, it would be a really nice little feature and would not be part of a broader pattern of misuse of the framework! However, I understand that this could be a gateway to hackish behaviour, and as such you might not want to include it.

Thanks again for your feedback!


What you seek (site-specific configuration that doesn't affect the analysis itself) is reasonable, and it is a feature that one might add to a real workflow management system (which the reference CWL runner is not :-)

Perhaps you could add this feature to Toil or another system with CWL support? http://www.commonwl.org/#Implementations
