Docker: Pass reads file from host to container
12 months ago
davidmaimoun ▴ 50

Hello, I need your help please.

I want to create a Docker container to analyse my reads and run it via a Nextflow or WDL workflow. I have a Python script and would like to pass my reads as arguments, something like "myScript.py args". I don't know how to run a Docker container while passing files from the host to the container.

I don't want to copy my files into the container, only pass my different reads dynamically to the container and run my script on them:

import sys

reads = sys.argv[1:]  # every argument after the script name is one read file
for read in reads:
    analyse(read)
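
For what it's worth, here is a minimal sketch of how the script side could look, assuming every command-line argument is the path of one read file as seen inside the container. The analyse function below is only a placeholder that counts lines, not the real analysis, and main is a hypothetical helper name.

```python
import argparse
import sys
from pathlib import Path


def analyse(read_path):
    """Placeholder analysis (assumption): count the lines in one read file."""
    with open(read_path) as handle:
        return sum(1 for _ in handle)


def main(argv=None):
    parser = argparse.ArgumentParser(description="Analyse one or more read files.")
    # Zero or more paths; these are paths inside the container, not on the host.
    parser.add_argument("reads", nargs="*", type=Path, help="read files to analyse")
    args = parser.parse_args(argv)

    if not args.reads:
        print("no read files given", file=sys.stderr)
        return {}

    results = {}
    for read in args.reads:
        results[str(read)] = analyse(read)
        print(f"{read}: {results[str(read)]} lines")
    return results


if __name__ == "__main__":
    main()
```

With an ENTRYPOINT of ["python", "app.py"], anything placed after the image name in docker run is appended as arguments, so the mounted paths reach the script through the command line.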

My Dockerfile:

FROM python:3.8.3
WORKDIR /application
COPY src/ .
RUN useradd appuser && chown -R appuser /application
USER appuser
ENTRYPOINT ["python", "app.py"]

Thank you !

docker • 2.0k views

Can't you just do this? You may need to change the bind source from /data to wherever the data is.

docker run --mount type=bind,source=/data,target=/data -it my_container python3 myScript.py /data/inputfile

Caveat, I am using singularity not docker, so I couldn't test the syntax but it should be quite close.

If you're using Nextflow, Docker containers mount the /tmp directory in the container by default. There are some options here, but typically everything in the process working directory is mounted to /tmp. Can you add an input path parameter to your script? Something like python app.py -I /tmp?

I usually use bash-based Docker containers, so I'm unfamiliar with any specific problems or how paths work in Python containers. But in testing, so long as docker run -v /path/to/reads:/tmp my_container works, it should be fine on Nextflow.
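
A rough sketch of what such an input path parameter could look like on the Python side. The -I/--input-dir option, the find_reads helper, and the list of file extensions are all assumptions for illustration, not something the original script is known to have.

```python
import argparse
from pathlib import Path


def find_reads(input_dir, patterns=("*.fastq", "*.fastq.gz", "*.fq", "*.fq.gz")):
    """Return the sorted read files found directly inside input_dir."""
    input_dir = Path(input_dir)
    found = set()
    for pattern in patterns:
        found.update(input_dir.glob(pattern))
    return sorted(found)


def main(argv=None):
    parser = argparse.ArgumentParser(description="List the reads in a mounted directory.")
    # The directory is a path inside the container, i.e. the mount target.
    parser.add_argument("-I", "--input-dir", default=".", help="directory holding the reads")
    args = parser.parse_args(argv)

    reads = find_reads(args.input_dir)
    for read in reads:
        print(read)
    return reads


if __name__ == "__main__":
    main()
```

The script only ever sees the container-side directory, so the same code should work whether the reads were mounted by hand with docker run -v or by a workflow engine.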

12 months ago
zorbax ▴ 650

Regardless of Nextflow, you can define your WORKDIR wherever you want and then mount the volume:

docker run -v $(pwd):/application test random.txt

E.g. file: random.txt

ojhdbfvphwdf
qergqwtyh1qjh
wrtynwqrynn
qwtghjnqryj

Your script, read_file.py:

import sys

filename = sys.argv[1]

with open(filename) as file:
    lines = [line.rstrip() for line in file]
    print(lines)

And your Dockerfile, built with docker build -t test -f Dockerfile . (note the trailing dot, which is the build context):

FROM python:3.8.3

WORKDIR /application

COPY read_file.py /opt
RUN mkdir -p /application
ENTRYPOINT ["python", "/opt/read_file.py"]
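
If you end up launching the container from Python rather than typing the command by hand, the same bind-mount idea can be sketched as below. The helper names docker_run_command and run_in_container, the /data mount point, and the read-only flag are assumptions chosen for illustration; the image is expected to have the script as its ENTRYPOINT, as in the Dockerfile above.

```python
import subprocess
from pathlib import Path, PurePosixPath


def docker_run_command(image, host_file, mount_point="/data"):
    """Build a docker run command that exposes one host file to the container.

    The file's parent directory is bind-mounted read-only at mount_point, and
    the container-side path of the file is passed as the only argument.
    """
    host_file = Path(host_file).resolve()
    container_path = PurePosixPath(mount_point) / host_file.name
    return [
        "docker", "run", "--rm",
        "-v", f"{host_file.parent}:{mount_point}:ro",
        image,
        str(container_path),
    ]


def run_in_container(image, host_file):
    """Run the image on one host file; raises CalledProcessError on failure."""
    subprocess.run(docker_run_command(image, host_file), check=True)
```

For example, run_in_container("test", "random.txt") would mount the directory containing random.txt and hand /data/random.txt to the entrypoint script. Nothing is copied into the image; the file stays on the host.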

Thank you for your help,

It's working locally, which is very helpful. It is not working when I try to run it via WDL. I tried to mount my script in /tmp, but that doesn't work with WDL, or I missed something. I'll dig deeper into the docs in the hope of finding a clue.

Thank you very much

12 months ago

If you are using Nextflow, you won't need to make many changes to work with your container. Build the container first.

Then configure Nextflow for docker following these instructions:

https://www.nextflow.io/docs/latest/container.html#docker

I use Nextflow with singularity routinely (write a docker Dockerfile, then convert that to a singularity image).

Generally, you can use the python script in a nextflow process normally.

E.g. this should work in a process script block if you create a bin directory next to your main.nf and put the (executable) Python script in it:

myScript.py infile.txt

Thank you Colin !

But something doesn't fit in my case; I'm afraid I've made a mistake somewhere.

Let's say that, in my workflow, I want to run a very simple dockerized Python script, "hello.py", that creates a "hello.txt" file with a greeting message in it.

Do I need a Linux distribution, and do I need to save my script in /bin so that it can be called by my WDL/Nextflow workflow?

In my Dockerfile I wrote the command CMD ["python3", "helloy.py"], and I thought the script would be called at runtime, but it doesn't work (docker: "[myDockerHubId]/test_hello:1").

Clearly I don't fully understand how it works. I'd be glad if you could send me a simple example of the script, the Dockerfile, and the command to run it in WDL/Nextflow.

Thank you for your help!

Well, that's not at all what I said to do, but anyway.

With nextflow, just ignore Docker for a bit, use the system python, and try to get something simple to work.

This can be as simple as this example:

https://github.com/nextflow-io/patterns/blob/master/docs/process-per-file-path.md

Try to get that working, then change the process or add a new one to run your python script. Work from there. Docker is more intended for installing packages which your scripts depend on, eg with apt, conda, pip or micromamba. You actually run the tools in Nextflow.
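
As a starting point for that first simple test, a hello.py along these lines might do. It is only a sketch: the write_greeting name and its two optional arguments are made up for illustration. It writes into the current working directory by default, which, as far as I understand, is where a workflow engine runs the task command, so the script is called explicitly from the process rather than through the image's CMD.

```python
import sys
from pathlib import Path


def write_greeting(out_path="hello.txt", name="world"):
    """Write a one-line greeting to out_path and return the path."""
    out_path = Path(out_path)
    out_path.write_text(f"Hello, {name}!\n")
    return out_path


if __name__ == "__main__":
    # Optional arguments: output file name, then the name to greet.
    write_greeting(*sys.argv[1:3])
```

Run with the system Python first (python hello.py), check that hello.txt appears, and only then move the same command into a process and add the container.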

Thank you Colin, I'll work on it.
