ExpressionTool failing after a scatter: How do I debug the Javascript?
5.7 years ago
alanh ▴ 170

We're writing a workflow (DNA-seq, somatic) in which we process a pair of BAMs separately and then run later steps on them together. The workflow scatters the tumor and normal samples for parallel processing, producing "tumor.sorted.bam" and "normal.sorted.bam", then gathers them into an array for a realignment step whose output is "tumor.realigned.bam" and "normal.realigned.bam". (BAM index files (*.bai) are created and passed along as secondaryFiles for each of these steps, here and below, as appropriate.)

These BAMs need some post-processing (re-sorting, dealing with duplicate reads). I'm able to scatter over them and get "tumor.realigned.md.bam" and "normal.realigned.md.bam" back as what I believe to be an array of Files.

In the next step, I try to run an ExpressionTool to convert the array of Files into two File objects with secondaryFiles, so that I can refer to the tumor and normal BAMs explicitly, but I think something is failing in the JavaScript of my ExpressionTool.

class: ExpressionTool
# collect_tumor_normal_bams.cwl
cwlVersion: v1.0
inputs:
  bams:
    type: File[]
    secondaryFiles: [".bai"]
outputs:
  tumorBam: File
  normalBam: File
requirements:
  InlineJavascriptRequirement: {}
expression: |
  ${

    var tumor = [];
    var tumorSecondary = [];
    var normal = [];
    var normalSecondary = [];

    for (var filenum in inputs.bams) {
      if (inputs.bams[filenum].basename.match(/tumor[^\/]*\.bam$/i) ) {
        tumor.push (inputs.bams[filenum]);
      }
      if (inputs.bams[filenum].basename.match(/tumor[^\/]*\.bai$/i)) {
        tumorSecondary.push (inputs.bams[filenum]);
      }

      if (inputs.bams[filenum].basename.match(/normal[^\/]*\.bam$/i)) {
        normal.push (inputs.bams[filenum]);
      }
      if (inputs.bams[filenum].basename.match(/normal[^\/]*\.bai$/i)) {
        normalSecondary.push (inputs.bams[filenum]);
      }

    }

    tumor["secondaryFiles"] = tumorSecondary;
    normal["secondaryFiles"] = normalSecondary;

    return {"tumorBam": tumor, "normalBam": normal}
    }

Here is a snippet from the higher-level CWL that is calling the above:

# lots of stuff above here that works
  realign:
     run: commandline/realign.cwl
     in:
       bam_files: stageForRealign/bamFiles
       reference_fasta: referenceFasta
       targets_bed: captureBed
     out:
       [bam_file]

   post_realign_sort_index_md:
     run: commandline/bamSortMarkDups.cwl
     scatter: input_file
     in:
       input_file: realign/bam_file
     out:
       [bam_file]

   # Everything works above here:  I get the expected BAMs and BAIs
   # nothing below here works:  I suspect this is an issue with my ExpressionTool, 
   # but I don't know a good way to debug

   collectTN:
     run: expression/collect_tumor_normal_bams.cwl
     in:
       bams: [post_realign_sort_index_md/bam_file]
     out:
       [tumorBam, normalBam]

   coverage_tumor:
     run: commandline/coverage.cwl
     in: 
       bam_file: collectTN/tumorBam
       bed_file: coverageWindows
       genome_file: bedtoolsGenome
     out:
       [counts_file]

   coverage_normal:
     run: commandline/coverage.cwl
     in: 
       bam_file: collectTN/normalBam
       bed_file: coverageWindows
       genome_file: bedtoolsGenome
     out:
       [counts_file]

   call_somatic_variants: 
     run: commandline/somatic-caller.cwl
     in: 
       tumor_bam_file: collectTN/tumorBam
       normal_bam_file: collectTN/normalBam
       reference_fasta: referenceFasta
       regions_bed: captureBed 
     out: 
      [ somatic_caller_output ]

If it makes a difference, we're using the Arvados CWL runner.

5.7 years ago
alanh ▴ 170

Apparently the problem was that I defined var tumor and var normal as arrays, while the outputs tumorBam and normalBam are declared as type File.
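A small illustration of what the array version actually hands back, runnable with Node.js outside of any CWL runner. It assumes the runner passes the expression result back as JSON, which is my understanding of cwltool-based runners but is not confirmed in this thread; the mock File values are hypothetical.

```javascript
// Mock File object, shaped like a CWL File (values are hypothetical).
var bam = { class: "File", basename: "tumor.realigned.md.bam" };

// What the original expression built: an array with a named property attached.
var tumor = [];
tumor.push(bam);
tumor["secondaryFiles"] = [];

// A named property on an array is not an array element, so JSON drops it.
var serialized = JSON.stringify({ tumorBam: tumor });
console.log(serialized);
// prints {"tumorBam":[{"class":"File","basename":"tumor.realigned.md.bam"}]}
```

Under that assumption, tumorBam arrives as a one-element array with no secondaryFiles key, which does not match an output declared as File.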

Changing those declarations as follows makes it run:

var tumor;
var tumorSecondary = [];
var normal;
var normalSecondary = [];

And instead of doing

tumor.push (inputs.bams[filenum]);

I should do:

tumor = inputs.bams[filenum]

and

normal = inputs.bams[filenum]
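
On the original question of how to debug the JavaScript: one approach is to copy the expression body into a plain function and run it with Node.js against a hand-written inputs object. The sketch below does that for the corrected logic. The function name collectTumorNormal and the mock values are hypothetical, and it assumes the .bai files arrive attached as secondaryFiles of each BAM (as described in the question), so each File is passed through unchanged rather than having its secondaryFiles rebuilt.

```javascript
// Stand-alone version of the expression body, so it can be run with Node.js.
// `inputs` mirrors what the CWL runner would supply to the ExpressionTool.
function collectTumorNormal(inputs) {
  var tumor = null;
  var normal = null;
  for (var i = 0; i < inputs.bams.length; i++) {
    var file = inputs.bams[i];
    if (/tumor[^\/]*\.bam$/i.test(file.basename)) {
      tumor = file; // a single File object, not an array
    }
    if (/normal[^\/]*\.bam$/i.test(file.basename)) {
      normal = file;
    }
  }
  return { tumorBam: tumor, normalBam: normal };
}

// Hypothetical mock of the scattered step's output.
var mockInputs = {
  bams: [
    {
      class: "File",
      basename: "tumor.realigned.md.bam",
      secondaryFiles: [{ class: "File", basename: "tumor.realigned.md.bam.bai" }]
    },
    {
      class: "File",
      basename: "normal.realigned.md.bam",
      secondaryFiles: [{ class: "File", basename: "normal.realigned.md.bam.bai" }]
    }
  ]
};

var result = collectTumorNormal(mockInputs);
console.log(JSON.stringify(result, null, 2));
```

If either output prints as null, the regular expression did not match the basename actually produced by the previous step, which narrows the problem down before going back to the runner.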