CWL GATK GenotypeGVCFs error on Linux due to quotes in filenames
4.0 years ago
cocchi.e89 ▴ 290

Hi, I noticed that every time I run .cwl scripts that include GATK GenotypeGVCFs, the runner hits an error related to how the previous step (GATK GenomicsDBImport, also called through .cwl) names the directories inside the GenomicsDB workspace:

Invalid filename: '8$1$146364022' contains illegal characters

Looking inside the GenomicsDB directory (the GenomicsDBImport output), every chromosome directory is listed with its name in quotes, which seems to be what raises the error above:

enrico@godzilla:/media/kong/enrico/MCD/cwl-run-DIR$ ls -thal MCD_n15/
total 136K
drwxrwsr-x  3 enrico lab 4.0K Dec  5 10:47  ..
drwx------  4 enrico lab 4.0K Dec  5 10:35 'X$1$155270560'
drwx------ 25 enrico lab 4.0K Dec  5 10:29  .
drwx------  4 enrico lab 4.0K Dec  5 10:29 '22$1$51304566'
drwx------  4 enrico lab 4.0K Dec  5 10:25 '21$1$48129895'
drwx------  4 enrico lab 4.0K Dec  5 10:22 '20$1$63025520'
drwx------  4 enrico lab 4.0K Dec  5 10:17 '19$1$59128983'
drwx------  4 enrico lab 4.0K Dec  5 10:10 '18$1$78077248'
drwx------  4 enrico lab 4.0K Dec  5 10:04 '17$1$81195210'
drwx------  4 enrico lab 4.0K Dec  5 09:56 '16$1$90354753'
drwx------  4 enrico lab 4.0K Dec  5 09:49 '15$1$102531392'
drwx------  4 enrico lab 4.0K Dec  5 09:42 '14$1$107349540'
drwx------  4 enrico lab 4.0K Dec  5 09:34 '13$1$115169878'
drwx------  4 enrico lab 4.0K Dec  5 09:28 '12$1$133851895'
drwx------  4 enrico lab 4.0K Dec  5 09:17 '11$1$135006516'
drwx------  4 enrico lab 4.0K Dec  5 09:06 '10$1$135534747'
drwx------  4 enrico lab 4.0K Dec  5 08:55 '9$1$141213431'
drwx------  4 enrico lab 4.0K Dec  5 08:45 '8$1$146364022'
drwx------  4 enrico lab 4.0K Dec  5 08:35 '7$1$159138663'
drwx------  4 enrico lab 4.0K Dec  5 08:22 '6$1$171115067'
drwx------  4 enrico lab 4.0K Dec  5 08:09 '5$1$180915260'
drwx------  4 enrico lab 4.0K Dec  5 07:56 '4$1$191154276'
drwx------  4 enrico lab 4.0K Dec  5 07:43 '3$1$198022430'
drwx------  4 enrico lab 4.0K Dec  5 07:28 '2$1$243199373'
drwx------  4 enrico lab 4.0K Dec  5 07:09 '1$1$249250621'
-rwx------  1 enrico lab 8.4K Dec  5 06:49  vidmap.json
-rwx------  1 enrico lab  18K Dec  5 06:49  vcfheader.vcf
-rwx------  1 enrico lab 1.4K Dec  5 06:49  callset.json
-rwx------  1 enrico lab    0 Dec  5 06:49  __tiledb_workspace.tdb

This happens every single time I produce a GenomicsDBImport output on my Ubuntu 18.04.5 Linux machine.
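One thing worth checking before blaming the quotes: recent GNU coreutils versions of `ls` wrap names in single quotes on a terminal when they contain shell-special characters such as `$`, so the quotes in the listing above may be display-only and not part of the names on disk. The sketch below is a minimal way to check that from Python. The helper name `inspect_names` and the `SAFE_NAME` pattern are made up for illustration; in particular the pattern is an assumed conservative character set, not necessarily the rule the CWL runner applies.

```python
import os
import re
import tempfile

# Assumed conservative character set, for illustration only; the rule a given
# CWL runner actually applies to filenames may differ.
SAFE_NAME = re.compile(r"^[A-Za-z0-9._+-]+$")


def inspect_names(directory):
    """Return (name, has_quote, offending_chars) for each entry in directory."""
    report = []
    for name in sorted(os.listdir(directory)):
        # True only if a quote character is really part of the name on disk.
        has_quote = "'" in name or '"' in name
        # Characters that fall outside the assumed safe set.
        offending = sorted({ch for ch in name if not SAFE_NAME.match(ch)})
        report.append((name, has_quote, offending))
    return report


if __name__ == "__main__":
    # Throwaway workspace mimicking two entries from the listing above.
    with tempfile.TemporaryDirectory() as workspace:
        os.mkdir(os.path.join(workspace, "8$1$146364022"))
        open(os.path.join(workspace, "callset.json"), "w").close()
        for name, has_quote, offending in inspect_names(workspace):
            print(repr(name), "| quotes in name:", has_quote, "| unsafe:", offending)
```

Pointing `inspect_names` at the real workspace (`MCD_n15/` here) would show whether the names contain quote characters at all. If they do not, the `$` separators in names like `8$1$146364022` are the more likely trigger for the "illegal characters" message.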

Has anybody worked around this? I know I can call GATK outside .cwl, but for pipeline purposes I'd like to be able to pass this DB through .cwl too.

Thank you very much in advance for any help! Below is my CWL script:

#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
label: gatk GenomicsDBImport on GATK docker images

hints:
  DockerRequirement:
    dockerPull: broadinstitute/gatk:latest
  ResourceRequirement:
    coresMin: $(inputs.GenomicsDBImport_coresMin)
    ramMin: $(inputs.GenomicsDBImport_ramMin)

requirements:
  InlineJavascriptRequirement: {}

baseCommand: gatk
arguments: [ "GenomicsDBImport" ]

inputs:
  - id: interval_list
    type: File
    inputBinding:
      position: 1
      prefix: '-L'
  - id: cohort_name
    type: string
    inputBinding:
      position: 2
      prefix: '--genomicsdb-workspace-path'
  - id: gvcf_files
    type:
      - type: array
        items: File
        inputBinding:
          position: 0
          prefix: '-V'
          separate: true
    secondaryFiles:
      - .tbi

outputs:
  GenomicsDBImport_directory:
    type: Directory
    outputBinding:
      glob: $(inputs.cohort_name)
gatk cwl GenotypeGVCFs • 1.1k views

Hello, sorry for only seeing this now. Support for CWL and the CWL reference runner has moved from Biostars to https://cwl.discourse.group/; could you post your question there?
