Using the Directory type in Common Workflow Language
8.4 years ago
Peter vH

In trying to solve a problem posed by Stuti Agrawal, I've been trying to make a working example that uses the Directory type of CWL v1.0 as an input, and I'm not getting very far. I tried writing a tool that picks up the files from a Directory and adds them to a zip file. Note that this doesn't add the directory itself, but rather its contents. So if you have:

/home/foo/one.txt
/home/foo/two.txt

it should add one.txt and two.txt to the command line, i.e.:

zip data.zip one.txt two.txt

I'm not sure whether this is the intended use case for Directory. Alternatively, I'd like to be able to add /home/foo to the command line, yielding:

zip -r data.zip /home/foo

In any event, here is a non-working experiment:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0

requirements:
    - class: InlineJavascriptRequirement

class: CommandLineTool

baseCommand: ["zip"]

inputs:
  testdir:
    type: Directory
    doc: |
        string(s): list files in a directory.
    inputBinding:
        position: 1
outputs:
  zipped_file:
    type: File
    outputBinding:
      glob: data.zip

arguments:
  - valueFrom: data.zip

What should be changed to get either my first or my second example command line?

Thanks

Hi Peter,

I think this is a bug. It's supposed to add the directory path to the command line, but that doesn't seem to be happening here. I'll look into it.

To follow up, I have a PR which correctly adds the Directory path to the command line as intended:

https://github.com/common-workflow-language/cwltool/pull/144
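
For the second form from the question (zip -r data.zip /home/foo), a minimal sketch along the following lines should work once the Directory path is added to the command line as described above. It is untested here. The -r flag and the explicit positions are additions to the tool from the question, and the path the runner passes will be wherever it stages the directory, which is not necessarily /home/foo.

```yaml
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool

# "-r" is added here so that zip recurses into the directory.
baseCommand: [zip, "-r"]

arguments:
  # The archive name comes first (position 1) ...
  - valueFrom: data.zip
    position: 1

inputs:
  testdir:
    type: Directory
    inputBinding:
      # ... followed by the directory path (position 2).
      position: 2

outputs:
  zipped_file:
    type: File
    outputBinding:
      glob: data.zip
```

Because the directory is passed by path, the entries in the archive will carry that staged path as a prefix rather than bare file names.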

By the way, I used this input with the above CWL:

testdir:
  class: Directory
  location: file:///home/pvh/Documents/code/cwltool/foo

Hello Peter vH!

We believe that this post does not fit the main topic of this site.

This is not a bioinformatics question; have a look at stackoverflow.com.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!

I don't know; the Common Workflow Language is an up-and-coming and important part of reproducible bioinformatics.

Well in that case, forgive my ignorance :) (reopened)

As Devon Ryan noted, CWL is a (now) stable specification for describing workflows, and it grew out of the need for documented, reproducible bioinformatics workflows. It had a significant presence at the Bioinformatics Open Source Conference in each of the last two years and has just released a v1.0 specification.

https://www.open-bio.org/wiki/BOSC_2015_Schedule
https://www.open-bio.org/wiki/BOSC_2016_Schedule

Peter vH, it might be worth posting this to the CWL mailing list:

https://groups.google.com/forum/#!forum/common-workflow-language

You might get a more direct answer there.

Some of the CWL folks have directed "support" questions to Biostars recently on the understanding that when they get answered here the answers are easy for people to find in the future. I'll alert the community to this question though.

Ah, didn't know that but using Biostars makes sense. I just haven't seen many CWL-related queries here yet.

8.4 years ago

With thanks to Peter's patch to cwltool, here is a working variation:

#!/usr/bin/env cwl-runner

class: CommandLineTool
cwlVersion: v1.0

requirements:
 - class: InitialWorkDirRequirement
   listing: $(inputs.directory_to_zip.listing)     

inputs:
  directory_to_zip:
    type: Directory

baseCommand: [zip, "--recurse-paths", "-", "."]

outputs:
  zipped_file:
    type: stdout
    format: application/zip

Sample usage with command line input:

$ ls t
bam  bam.bai  three
$ cwltool zipper.cwl --directory t
/home/michael/cwltool/env/bin/cwltool 1.0.20160820220956
[job zipper.cwl] /tmp/tmpppA9iT$ zip \
    --recurse-paths \
    - \
    . > /tmp/tmpppA9iT/8cccc91d-7bf6-4008-a727-952ad9752114
  adding: bam (stored 0%)
  adding: three (stored 0%)
  adding: bam.bai (stored 0%)
  adding: 8cccc91d-7bf6-4008-a727-952ad9752114 (deflated 53%)
Final process status is success
{
    "zipped_file": {
        "format": "file:///home/michael/cwltool/application/zip", 
        "checksum": "sha1$4e7aea5191e78a98a66b23cb69652df3a607fbde", 
        "basename": "8cccc91d-7bf6-4008-a727-952ad9752114", 
        "location": "file:///home/michael/cwltool/8cccc91d-7bf6-4008-a727-952ad9752114", 
        "path": "/home/michael/cwltool/8cccc91d-7bf6-4008-a727-952ad9752114", 
        "class": "File", 
        "size": 746
    }
}

Finally, sample usage with an input document:

$ cat zipper.in.yml 
directory_to_zip:
  class: Directory
  location: t
$ cwltool zipper.cwl zipper.in.yml 
/home/michael/cwltool/env/bin/cwltool 1.0.20160820220956
[job zipper.cwl] /tmp/tmpQqxPjt$ zip \
    --recurse-paths \
    - \
    . > /tmp/tmpQqxPjt/2d0ca78b-6248-46aa-8b1c-0708fd0008a1
  adding: bam (stored 0%)
  adding: 2d0ca78b-6248-46aa-8b1c-0708fd0008a1 (deflated 11%)
  adding: three (stored 0%)
  adding: bam.bai (stored 0%)
Final process status is success
{
    "zipped_file": {
        "format": "file:///home/michael/cwltool/application/zip", 
        "checksum": "sha1$7c896e9247972af38d3836e621cd9aa66c1b3499", 
        "basename": "2d0ca78b-6248-46aa-8b1c-0708fd0008a1", 
        "location": "file:///home/michael/cwltool/2d0ca78b-6248-46aa-8b1c-0708fd0008a1", 
        "path": "/home/michael/cwltool/2d0ca78b-6248-46aa-8b1c-0708fd0008a1", 
        "class": "File", 
        "size": 710
    }
}
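
One thing visible in both logs above: because stdout is captured to a temporary file inside the working directory and zip is told to archive ".", the capture file itself ends up in the archive (the "adding:" line whose name matches the output basename). Below is a hedged sketch of a variant that names the entries explicitly, which also gives the first command-line form from the question. It assumes the runner populates listing for Directory inputs (as the runs above suggest cwltool does) and that the directory is not empty. The data.zip name and the JavaScript expression are illustrative and are not part of the tested tool above.

```yaml
#!/usr/bin/env cwl-runner

class: CommandLineTool
cwlVersion: v1.0

requirements:
  # Needed for the JavaScript expression in "arguments".
  - class: InlineJavascriptRequirement
  # Stage the directory's entries into the working directory, as above.
  - class: InitialWorkDirRequirement
    listing: $(inputs.directory_to_zip.listing)

inputs:
  directory_to_zip:
    type: Directory

# Write to a named archive instead of stdout.
baseCommand: [zip, "--recurse-paths", data.zip]

arguments:
  # One command-line element per top-level entry of the directory.
  - valueFrom: '$(inputs.directory_to_zip.listing.map(function (entry) { return entry.basename; }))'

outputs:
  zipped_file:
    type: File
    outputBinding:
      glob: data.zip
```

For the directory t used above, this should produce a command line along the lines of zip --recurse-paths data.zip bam bam.bai three, with the order of the names following the order of the listing.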