Hello everyone,
I'm working on a workflow whose first step takes a directory containing more than a million files as input. cwltool (version 1.0.20181217162649, running in a Python 3.6.7 venv) works fine with a very small amount of test data. As soon as I input a larger set (~60,000 files), it complains about the large number of files and suggests adding the following to my CommandLineTool:
$namespaces:
  cwltool: "http://commonwl.org/cwltool#"
hints:
  cwltool:LoadListingRequirement:
    loadListing: shallow_listing
However, these sections are already present in my tool description (although the page referenced under $namespaces doesn't seem to exist), and cwltool does not seem to know how to interpret them. It prints the following message:
demultiplexing/demultiplexingToolDeepbinner.cwl:19:3: Unknown hint http://commonwl.org/cwltool#LoadListingRequirement
In most cases, the workflow execution will fail shortly after.
I have found no documentation regarding load listing anywhere. A (test) workflow in the cwltool repository produces the same error.
This is the code of my command line tool:
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [deepbinner, realtime]
doc: |
  Uses deepbinner to sort raw nanopore reads of barcoded DNA by barcode.
requirements:
  InlineJavascriptRequirement: {}
  DockerRequirement:
    dockerImageId: tmi_deepbinner
  InitialWorkDirRequirement:
    listing:
      - entry: $(inputs.reads_directory)
        writable: true
arguments:
  - valueFrom: $("demultiplexed")
    prefix: --out_dir
    position: 2
hints:
  cwltool:LoadListingRequirement:
    loadListing: no_listing
inputs:
  reads_directory:
    label: Directory containing raw nanopore reads in .fast5 format
    type: Directory
    inputBinding:
      prefix: --in_dir
      position: 1
  barcoding_type:
    label: Specifies whether native or rapid barcoding was performed
    type: string
    inputBinding:
      prefix: --
      separate: false
      position: 3
outputs:
  barcode_directories:
    label: Directories with raw .fast5 data, each pertaining to a specific barcode
    type: Directory[]
    outputBinding:
      glob: $("demultiplexed/barcode*")
  unclassified_reads_directory:
    label: Directory containing raw .fast5 data that could not be matched to a barcode
    type: ["null", Directory]
    outputBinding:
      glob: demultiplexed/unclassified
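
For comparison, here is a stripped-down sketch of how I understand the two blocks suggested by cwltool would sit together in a single tool description. The tool itself (a plain ls on the input directory) is a hypothetical placeholder and not my actual tool; only the $namespaces and hints blocks are taken from the cwltool message, and I have not confirmed that this layout is what cwltool expects.

```yaml
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool

# Namespace prefix exactly as suggested in the cwltool message.
$namespaces:
  cwltool: "http://commonwl.org/cwltool#"

# Hint exactly as suggested in the cwltool message.
hints:
  cwltool:LoadListingRequirement:
    loadListing: shallow_listing

# Hypothetical placeholder command, for illustration only.
baseCommand: [ls]
inputs:
  reads_directory:
    type: Directory
    inputBinding:
      position: 1
outputs: []
```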