CWL toil-cwl-runner for two samples: cannot find an error log or the reason for failure
5.8 years ago
a.james ▴ 240

Dear All,

I am running my CWL workflow on two samples with toil-cwl-runner. The workflow runs forever, and I cannot find any clear error message in the log files.

This is main_parell.cwl, where the sample handling and the parallelisation are defined:

#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: Workflow

requirements:
 - class: ScatterFeatureRequirement
 - class: SubworkflowFeatureRequirement
 - class: InlineJavascriptRequirement

inputs:
 reads1:
  type: File[]
 reads2:
  type: File[]
 sample_names:
  type: string[]
 genomeDir:
  type: Directory
 sjdbGTFfile:
  type: File
 annotation:
  type: File
 bam:
  type: File
 exp_out:
  type: string
 outputfile:
  type: File
 genome:
  type: File

outputs:
 bam_dir:
  type: Directory
  outputSource: collect/bam_dir
 count_dir:
  type: Directory
  outputSource: collect/count_dir
steps:
  pipeline_workflow:
     run: workflow.cwl
     scatter: [reads1, reads2, sample_name]
     scatterMethod: dotproduct
     in:
      reads1: reads1
      reads2: reads2
      sample_name: sample_names
      genomeDir: genomeDir
      sjdbGTFfile: sjdbGTFfile
      annotation: annotation
      bam: bam
      exp_out: exp_out
      genome: genome
      outputfile: outputfile
     out: [alignment_out, expression_out]
  collect:
    in:
      bam_files:
        source: [pipeline_workflow/alignment_out]
        linkMerge: merge_flattened
      count_files:
        source: [pipeline_workflow/expression_out]
        linkMerge: merge_flattened
    out: [bam_dir, count_dir]
    run:
      class: ExpressionTool
      id: "collect_step"
      inputs:
        bam_files: File[]
        count_files: File[]
      outputs:
        bam_dir: Directory
        count_dir: Directory
      expression: |
       ${
        return {
          "bam_dir": {
             "class": "Directory",
             "basename": "bams",
             "listing": inputs.bam_files
          },
          "count_dir": {
             "class": "Directory",
             "basename": "counts",
             "listing": [].concat.apply([], inputs.count_files)
          }
        };
        }
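
To make explicit what I expect `scatterMethod: dotproduct` to do here, below is a minimal Python sketch of my understanding, not Toil's or cwltool's actual implementation. The function name `dotproduct_scatter` is hypothetical; the file names are taken from the yml file. The i-th elements of the scattered arrays are paired into one job each, and all arrays must have the same length.

```python
from typing import Any, Dict, List


def dotproduct_scatter(
    scattered: Dict[str, List[Any]], fixed: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Build one input object per scatter job by pairing the i-th elements.

    `scattered` holds the array inputs named in `scatter:`;
    `fixed` holds the inputs passed unchanged to every job.
    """
    lengths = {name: len(values) for name, values in scattered.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"dotproduct needs equal-length arrays, got {lengths}")
    n_jobs = next(iter(lengths.values()), 0)

    jobs = []
    for i in range(n_jobs):
        job = dict(fixed)  # copy the non-scattered inputs
        for name, values in scattered.items():
            job[name] = values[i]  # take the i-th element of each array
        jobs.append(job)
    return jobs


jobs = dotproduct_scatter(
    {
        "reads1": ["sample1_R1.fastq.gz", "sample2_R1.fastq.gz"],
        "reads2": ["sample1_R2.fastq.gz", "sample2_R2.fastq.gz"],
        "sample_name": ["sample1", "sample2"],
    },
    {"genomeDir": "hg19_hs37d5.overhang100_STAR"},
)
for job in jobs:
    print(job["sample_name"], job["reads1"], job["reads2"])
```

So with two samples I expect exactly two runs of workflow.cwl, each receiving a single File for reads1 and reads2 and a single string for sample_name.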

The per-sample pipeline, workflow.cwl, looks as follows:

#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: Workflow

doc: "Workflow Components: alignment -> count expression -> lib-size"

requirements:
 - class: ScatterFeatureRequirement
 - class: SubworkflowFeatureRequirement
 - class: InlineJavascriptRequirement

inputs:
 reads1:
  type: File
 reads2:
  type: File
 genome:
  type: File
 genomeDir:
  type: Directory
 #outSAMattrRGline: string
 sjdbGTFfile:
  type: File
 annotation:
  type: File
 bam:
  type: File
 exp_out:
  type: string
 sample_name:
  type: string


outputs:
 alignment_out:
  type: File
  outputSource: star/star_bam
 expression_out:
  type: File
  outputSource: expressioncount/expression_out

steps:
  star:
    run: star.cwl
    in:
     genomeDir: genomeDir
     reads1: reads1
     reads2: reads2
     sjdbGTFfile: sjdbGTFfile
     outFileNamePrefix: sample_name
     runThreadN:
      default: 4
     outFilterMultimapScoreRange:
      default: 1
    out: [star_bam]
  expressioncount:
   run: count_expression.cwl
   in:
    annotation: annotation
    bam: star/star_bam
    exp_out: exp_out
   out: [expression_out]

And the yml file looks as follows:

    reads1:  # array of type "File"
      - class: File
        path:  sample1_R1.fastq.gz
      - class: File
        path:  sample2_R1.fastq.gz
    reads2:  # array of type "File"
      - class: File
        path:  sample1_R2.fastq.gz
      - class: File
        path:  sample2_R2.fastq.gz
    sample_names:  # array of type "string"
      - sample1
      - sample2
    #outSAMattrRGline: ID::M45ZB
    genomeDir:
      class: Directory
      path: hg19_hs37d5.overhang100_STAR
    genome:
     class: File
     path:  /genome.fa
    sjdbGTFfile:
     class: File
     path:  gencode.v28lift37.annotation.gtf
    samples:
     - class: File
       path: sample1.counts
     - class: File
       path: sample2.counts

    annotation:
     class: File
     path: gencode.v19.annotation.hs37d5_chr.gtf
    bam:
     - class: File
       path: sample1.bam
     - class: File
       path: sample2.bam
    exp_out:
      - sample1.counts
      - sample2.counts
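
As a sanity check, here is a minimal Python sketch that compares the shape of each value in the yml file with the type declared in the `inputs:` section of main_parell.cwl. It is only a rough shape comparison under my own assumptions, not real CWL validation: the helper names `shape_of` and `check_inputs` are hypothetical, and the job file is transcribed by hand as a Python dict rather than parsed. Whether any mismatch it reports is related to the hang is not established.

```python
from typing import Any, Dict, List

# Input types copied from the `inputs:` section of main_parell.cwl.
DECLARED = {
    "reads1": "File[]", "reads2": "File[]", "sample_names": "string[]",
    "genomeDir": "Directory", "sjdbGTFfile": "File", "annotation": "File",
    "bam": "File", "exp_out": "string", "outputfile": "File", "genome": "File",
}


def f(path: str) -> Dict[str, str]:
    """Shorthand for a CWL File object."""
    return {"class": "File", "path": path}


# Hand transcription of the yml job file shown above.
PROVIDED = {
    "reads1": [f("sample1_R1.fastq.gz"), f("sample2_R1.fastq.gz")],
    "reads2": [f("sample1_R2.fastq.gz"), f("sample2_R2.fastq.gz")],
    "sample_names": ["sample1", "sample2"],
    "genomeDir": {"class": "Directory", "path": "hg19_hs37d5.overhang100_STAR"},
    "genome": f("/genome.fa"),
    "sjdbGTFfile": f("gencode.v28lift37.annotation.gtf"),
    "samples": [f("sample1.counts"), f("sample2.counts")],
    "annotation": f("gencode.v19.annotation.hs37d5_chr.gtf"),
    "bam": [f("sample1.bam"), f("sample2.bam")],
    "exp_out": ["sample1.counts", "sample2.counts"],
}


def shape_of(value: Any) -> str:
    """Rough shape of a job-file value: string, File, Directory, or X[]."""
    if isinstance(value, list):
        return (shape_of(value[0]) if value else "?") + "[]"
    if isinstance(value, dict):
        return value.get("class", "?")
    if isinstance(value, str):
        return "string"
    return type(value).__name__


def check_inputs(declared: Dict[str, str], provided: Dict[str, Any]) -> List[str]:
    """List inputs that are missing, undeclared, or of a different shape."""
    problems = []
    for name, expected in declared.items():
        if name not in provided:
            problems.append(f"missing input: {name} ({expected})")
        elif shape_of(provided[name]) != expected:
            got = shape_of(provided[name])
            problems.append(f"type mismatch: {name} expects {expected}, got {got}")
    for name in provided:
        if name not in declared:
            problems.append(f"not declared in workflow: {name}")
    return problems


problems = check_inputs(DECLARED, PROVIDED)
for line in problems:
    print(line)
```

Running `cwltool --validate main_parell.cwl` would be the proper check of the workflow document itself; the sketch above only looks at the job file against the declared input types.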

The issue is that toil-cwl-runner keeps running without any proper error logs or failure messages. Any help is appreciated. I ran toil-cwl-runner as follows:

toil-cwl-runner --stats --clusterStats --retryCount=0 --batchSystem=lsf --disableCaching --tmpdir-prefix ${TMP_DIR} --tmp-outdir-prefix ${TMP_OUT_DIR} --workDir ${WORK_DIR} --realTimeLogging --cleanWorkDir=never --clean=never --outdir ${OUT_DIR} --logDebug --logFile ${LOG_FILE} --writeLogs --jobStore ${JOB_STORE} main_parell.cwl moun.yml