CWL with R returns empty output directory with "Final process status is success" status
5.8 years ago
a.james ▴ 240

Dear All,

I have a CWL script that finishes with "Final process status is success", but none of the output files are written to the specified output directory. The CWL script has two input parameters: the first is the input directory, which holds all input files for HIF_scorefinder.R, and the second is a designated directory into which all output files are supposed to be written.

Currently the script runs and finishes with Final process status is success, but the output directory contains none of the expected files. Here is what I have in findscore.cwl:

cwlVersion: v1.0
class: CommandLineTool

doc: "The tool for calculating the score for a given set of input samples"

requirements:
 - class: ShellCommandRequirement
 - class: InlineJavascriptRequirement
 - class: InitialWorkDirRequirement
   listing:
    - entry: "$({class: 'Directory', listing: []})"
      entryname: $(inputs.outputdir)
      writable: true

baseCommand: [Rscript, HIF_scorefinder.R]

inputs:
 sample_input:
  type: Directory
  inputBinding:
   position: 1
 outputdir:
  type: string?
  inputBinding:
   position: 2

outputs:
 Hypoxiaresult:
  type: Directory
  outputBinding:
   glob: $(inputs.outputdir)

The YML file is,

sample_input:
 class: Directory
 path: /cluster/home/user/Projects/Kjo_proct/pocKj
outputdir: HypScore1

I run cwltool with environment preservation: cwltool --debug --preserve-entire-environment hypoxia.cwl moun.yml. I am using

CWL tool version cwltool v1.0 201810

The problem is that I need a bunch of files to be written to my final output directory, but the run leaves it empty. I cannot figure out what is wrong here.

Any help would be much appreciated!

CWL RNA-Seq • 2.4k views

For those who run into similar issues:

I now get the results after a minor change. I changed the glob from $(inputs.outputdir) to runtime.outdir, so cwltool now collects the output files from the directory where the tool is executed.
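
As a sketch of the change described here (only the outputs section changes; the rest of findscore.cwl is assumed to stay as posted), the output binding would look roughly like this:

```yaml
outputs:
  Hypoxiaresult:
    type: Directory
    outputBinding:
      # runtime.outdir is the tool's designated output directory,
      # so this glob collects whatever the tool left there
      glob: $(runtime.outdir)
```

Note that this captures the whole output directory of the job, not only the subdirectory named by outputdir.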

5.8 years ago

Hello @a.james, thank you for your question. Are you sure that your tool is writing to that directory? I made a self-contained version that does seem to work:

cwlVersion: v1.0
class: CommandLineTool

requirements:
 InlineJavascriptRequirement: {}
 InitialWorkDirRequirement:
   listing:
    - entry: "$({class: 'Directory', listing: []})"
      entryname: $(inputs.outputdir)
      writable: true

baseCommand: touch

arguments:
 - $(inputs.outputdir)/foo

inputs:
 sample_input:
  type: Directory
  inputBinding:
   position: 1
 outputdir:
  type: string?
  inputBinding:
   position: 2

outputs:
 Hypoxiaresult:
  type: Directory
  outputBinding:
   glob: $(inputs.outputdir)

And the result

$ cwltool 823.cwl --sample_input test823 --outputdir bar
/home/michael/cwltool/env3/bin/cwltool 1.0.20181217162649
Resolved '823.cwl' to 'file:///home/michael/cwltool/823.cwl'
[job 823.cwl] /tmp/tpwrt3i7$ touch \
    bar/foo \
    /tmp/tmpirj1ylx7/stgff33e089-ef95-4222-9bc8-dfa23dbde82c/test823 \
    bar
Could not collect memory usage, job ended before monitoring began.
[job 823.cwl] completed success
{
    "Hypoxiaresult": {
        "location": "file:///home/michael/cwltool/bar",
        "basename": "bar",
        "class": "Directory",
        "listing": [
            {
                "class": "File",
                "location": "file:///home/michael/cwltool/bar/foo",
                "basename": "foo",
                "checksum": "sha1$da39a3ee5e6b4b0d3255bfef95601890afd80709",
                "size": 0,
                "path": "/home/michael/cwltool/bar/foo"
            }
        ],
        "path": "/home/michael/cwltool/bar"
    }
}
Final process status is success

@Michael

Thanks for the reply. The R script runs independently as a command-line tool without any errors. But even with the additions from your solution, the CWL run without arguments finishes successfully and the output directory is still empty.

When I run it with arguments, it throws errors:

Error in if (input == "" || length(grep("\\n|\\r", input))) { : 
  missing value where TRUE/FALSE needed
Calls: read_in_new_patients -> make_patients_table -> data.frame -> fread
Execution halted
[job hypoxia.cwl] completed permanentFail
{
    "Hypoxiaresult": {
        "location": "file:///cluster/home/user/Projects/legacy_new//HypScore1",
        "basename": "HypScore1",
        "class": "Directory",
        "listing": [],
        "path": "/cluster/home/user/Projects/legacy_new/HypScore1"
    }
}
Final process status is permanentFail

Can you give an example of what you type into your command line to (successfully) run the R script? Where exactly do the results turn up if you do it like that? The arguments field in Mr. Crusoe's tool is only necessary for his example using touch; I don't think it was meant as a suggestion to add anything to your code.


@Tom, the command line for my R script looks like this: Rscript HIF_scorefinder.R /cluster/home/user/Projects/Kjo_proct/pocKj /cluster/home/user/Projects/Kjo_proct/HypScore1. This successfully produces all the output I need. The results are stored in the /cluster/home/user/Projects/Kjo_proct/HypScore1 directory, and the input files are located in /cluster/home/user/Projects/Kjo_proct/pocKj. The CWL script runs successfully but returns no output files.
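
For readers following along, here is a minimal sketch of how that command line might be mapped onto a CommandLineTool. It is not from the thread: the `script` input is a hypothetical addition so that HIF_scorefinder.R is passed to Rscript by path instead of being assumed to exist in the job's temporary working directory, and the sketch assumes the R script accepts a relative output directory name as its second argument. The InitialWorkDirRequirement entry is the one from the self-contained example in the answer above.

```yaml
cwlVersion: v1.0
class: CommandLineTool

requirements:
  InlineJavascriptRequirement: {}
  InitialWorkDirRequirement:
    listing:
      # pre-create an empty, writable directory named after outputdir
      - entry: "$({class: 'Directory', listing: []})"
        entryname: $(inputs.outputdir)
        writable: true

baseCommand: Rscript

inputs:
  script:            # hypothetical input: path to HIF_scorefinder.R
    type: File
    inputBinding:
      position: 1
  sample_input:
    type: Directory
    inputBinding:
      position: 2
  outputdir:         # a plain directory name, e.g. HypScore1
    type: string
    inputBinding:
      position: 3

outputs:
  Hypoxiaresult:
    type: Directory
    outputBinding:
      # resolved relative to the job's output directory
      glob: $(inputs.outputdir)
```

With this layout the job file would set outputdir to a plain name such as HypScore1 rather than an absolute path, because the output glob is matched relative to the tool's output directory.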


Do you get output if you just put "glob: ."?


Yes, so I need to update the post. I now get outputs from CWL after changing the InitialWorkDirRequirement to:

requirements:
 InlineJavascriptRequirement: {}
 InitialWorkDirRequirement:
  listing: [ $(inputs.outputdir) ]

I also changed the type of the outputdir input to Directory, so that it matches the Directory type of the output. The glob part now looks like this:

outputs:
 Hypoxiaresult:
  type: Directory
  outputBinding:
   glob: $(runtime.outdir)

Here, the glob collects all the results from runtime.outdir into the designated output directory.

The remaining problem is that, inside the output directory, it creates one more symbolic link to the output directory. I do not know how to get rid of it. Apart from that, the script now stores the output files.
