Hi guys,
I've developed a pipeline for analysis of NGS data on the HPC system at my university. I'd now like to release it and make it as easy as possible for others to use. The first things that came to my mind were Docker and Singularity, but I know little about either.
What I do know is that Docker, at least, requires sudo and has some issues running on HPC systems. Would my workflow then be to get the pipeline working on my local machine and build the image there? If I did this, would people on other HPC systems even be able to use the image?
As an aside, the Docker tutorial that I completed configures a Flask app. Is it possible to avoid this and just have the container set up the right environment, so that my pipeline can be run from the command line with the user's input? Would I just leave out the EXPOSE <port> line in the Dockerfile? Is this bad practice?
Thanks.
This is not really a bioinformatics question per se.
Yes, in my experience, Docker isn't very friendly in many HPC environments. If you want to package your pipeline for easier installation and use, I'd recommend putting it together with one of the many workflow management systems (Nextflow, Snakemake, etc.). Environments and dependencies can be managed by a platform like conda/Bioconda.
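On the aside about the Flask tutorial: as far as I know, a container does not have to run a web app or expose a port. EXPOSE only documents a port, so leaving it out should be fine for a command-line tool. The image just needs the environment plus an entry point that takes the user's input. Here is a minimal sketch of such an entry point in Python. The option names (--reads, --outdir, --threads) and the placeholder step are made up for illustration and are not taken from your pipeline:

```python
import argparse
import sys
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface for a hypothetical pipeline wrapper."""
    parser = argparse.ArgumentParser(
        prog="run_pipeline",
        description="Run the pipeline on one input file (illustrative sketch).",
    )
    # All option names below are assumptions, chosen only for this example.
    parser.add_argument("--reads", required=True, type=Path,
                        help="input reads file, e.g. a FASTQ")
    parser.add_argument("--outdir", type=Path, default=Path("results"),
                        help="directory for output files")
    parser.add_argument("--threads", type=int, default=1,
                        help="number of worker threads")
    return parser


def main(argv=None) -> int:
    """Parse the user's input, validate it, and hand off to the pipeline."""
    args = build_parser().parse_args(argv)

    if not args.reads.is_file():
        print(f"error: input file not found: {args.reads}", file=sys.stderr)
        return 2

    args.outdir.mkdir(parents=True, exist_ok=True)

    # Placeholder: the real pipeline steps would be called here.
    print(f"Would process {args.reads} with {args.threads} thread(s) "
          f"into {args.outdir}")
    return 0

# To use this as a script, end the file with: raise SystemExit(main())
```

Whatever packages the environment (a container image or a conda environment) would then simply invoke this script with the user's arguments; no port or web server is involved.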
Hope this helps.
To make it even easier to use, import it into Galaxy if that's possible.