Question

bedtools sort -faidx in CWL

6.2 years ago

dami ▴ 10

How does one use a secondary file as the input to a command in CWL?

I am trying to run:

bedtools sort -header -faidx hg38.fasta.fai

In CWL, I have gotten as far as:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
label: "sort a bed file based on occurrence in fasta"

requirements:
  - class: DockerRequirement
    dockerPull: "biocontainers/bedtools:v2.28.0_cv2"

baseCommand: ["bedtools", "sort", "-header"]

stdout: $(inputs.output_name)

inputs:
  input_bed:
    type: File
    inputBinding:
      position: 2
      prefix: -i

  reference_genome:
    type: File
    secondaryFiles:
       - .fai
    inputBinding:
      position: 1
      prefix: -faidx
      valueFrom: {$(inputs.reference_genome.basename).fai}

  output_name:
    type: string?
    default: sorted_coverage.bed

outputs:
  sorted_bed:
    type: stdout

But the valueFrom expression under reference_genome/inputBinding does not work.

Does anyone know how to do this?

CWL bedtools sort • 2.4k views

updated 5.8 years ago by kaushik.ghose ▴ 80 • written 6.2 years ago by dami ▴ 10

I have not used bedtools yet, so I'm not sure I understand correctly. Can you spell out the names of the files that bedtools needs to access for the command you gave at the beginning? I guess one is hg38.fasta.fai, but what are the names of the other files?

6.2 years ago by Tom ▴ 540

You are right, the BED file to be sorted needs to be added to the command at the top with the -i flag, so that you get:

bedtools sort -header -faidx ref.fasta.fai -i some.bed

6.0 years ago by dami ▴ 10

If the command you are trying to produce is bedtools sort -header -faidx hg38.fasta.fai, it looks like you don't need secondaryFiles at all. You should be able to pass in hg38.fasta.fai as your input file.

6.2 years ago by karl.sebby ▴ 100

Thanks for your response! What you are saying is true. However, if this is part of a bigger pipeline where the .fai is also needed as a FASTA index file, it's repetitive to add it everywhere manually. So I was hoping there was a way to access the secondary files directly.

6.0 years ago by dami ▴ 10

score 1 · Answer 1 · 2019-10-16

Hello @dami, there was a typo in your valueFrom field: the {} should not be there.

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
label: "sort a bed file based on occurrence in fasta"

requirements:
  - class: DockerRequirement
    dockerPull: "biocontainers/bedtools:v2.28.0_cv2"

baseCommand: ["bedtools", "sort", "-header"]

stdout: $(inputs.output_name)

inputs:
  input_bed:
    type: File
    inputBinding:
      position: 2
      prefix: -i

  reference_genome:
    type: File
    secondaryFiles:
       - .fai
    inputBinding:
      position: 1
      prefix: -faidx
      valueFrom: $(inputs.reference_genome.basename).fai

  output_name:
    type: string?
    default: sorted_coverage.bed

outputs:
  sorted_bed:
    type: stdout
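
A possible variation, offered as a sketch rather than something tested in this thread: basename is only the file name without its directory, so the tool above finds the index only if the staged reference sits in the working directory. Assuming the runner stages inputs elsewhere (which is common when running under Docker), the binding could use the File object's full path instead. Inside inputBinding.valueFrom, self refers to the input being bound; self.path and self.secondaryFiles are File fields from the CWL v1.0 specification, not names used earlier in this thread. Only the reference_genome entry changes:

```yaml
inputs:
  reference_genome:
    type: File
    secondaryFiles:
      - .fai
    inputBinding:
      position: 1
      prefix: -faidx
      # self is the reference_genome File object; self.path is its full
      # staged path, so this renders as <staged dir>/hg38.fasta.fai.
      # Assumption: the .fai secondary file is staged next to the FASTA.
      valueFrom: $(self.path).fai
      # Alternative (also an assumption): address the secondary file
      # directly with $(self.secondaryFiles[0].path)
```

With this binding the rendered command should look like bedtools sort -header -faidx <staged dir>/hg38.fasta.fai -i some.bed, where the staged directory is whatever the runner chooses.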