I'm not sure this problem is caused by missing permissions. But then again, I have no experience using GATK. Would you mind sharing the Docker container and the command line tool?
Hi Tom! Sorry for the belated response.
Here is the command line tool, where you can see the Docker container:
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand:
  - "gatk"
  - "MarkDuplicatesSpark"
hints:
  DockerRequirement:
    dockerPull: broadinstitute/gatk
requirements:
  - class: InlineJavascriptRequirement
inputs:
  inputFileName_markDups:
    type: File
    inputBinding:
      position: 4
      prefix: -I
    doc: One or more input SAM or BAM files to analyze. Must be coordinate sorted.
      Default value null. This option may be specified 0 or more times
  validationStringency:
    type: string
    default: LENIENT
    inputBinding:
      position: 23
      prefix: -VS
    doc: Validation stringency for all SAM/BAM/CRAM/SRA files read by this program. The default stringency value SILENT can improve performance when processing a BAM file in which variable-length data (read, qualities, tags) do not otherwise need to be decoded.
  metricsFile:
    type: string
    default: "metrics.txt"
    inputBinding:
      position: 6
      prefix: -M
    doc: File to write duplication metrics to Required
  createIndex:
    type: string?
    default: 'true'
    inputBinding:
      position: 20
      prefix: -OBI
    doc: Whether to create a BAM index when writing a coordinate-sorted BAM file.
      Default value false. This option can be set to 'null' to clear the default value.
      Possible values {true, false}
outputs:
  markDups_output:
    type: File
    outputBinding:
      glob: output.dedup.bam
    secondaryFiles:
      - .bai
arguments:
  - position: 10
    prefix: '-O'
    valueFrom: output.dedup.bam
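For anyone trying to reproduce this, a minimal job file for the tool above might look like the sketch below. The file names (`job.yml`, `sample.sorted.bam`) are placeholders and do not come from this thread; every other input falls back to the default declared in the tool.

```yaml
# job.yml (hypothetical file name)
# Only the BAM input has no default in the tool, so it is the only required entry.
inputFileName_markDups:
  class: File
  path: sample.sorted.bam   # placeholder; must be a coordinate-sorted SAM/BAM
```

Assuming the tool is saved as `markduplicates.cwl` (also a placeholder name), it could then be run with `cwltool markduplicates.cwl job.yml`.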
Hi Daiana! Did the answer below help you? Please feel free to ask if you have further questions.
Coming from a biology background, it takes me a lot of time (and questions in this forum) to figure some of the bioinformatics stuff out. So I'm very sympathetic to anyone needing more elaborate explanations.
I am pretty sure this is just a Spark/Docker issue (see Stack Overflow) and not related to CWL. I experience the same issue when trying to use the Broad Institute GATK container to run your command line tool.
The thread on Stack Overflow provides several solutions. I made a Docker container for GATK using the IBM JDK (as was suggested in the thread), and it seems to solve the problem.
Hi Tom, thank you very much for your help! Indeed, I have struggled a lot with bioinformatics stuff! I'm sorry for my belated response, but I have not yet been able to make it work... Maybe there is something wrong with how I import the Dockerfile? Just to be sure, my new code has:
And then my gatk-docker.yml is:
And the gatk-Dockerfile is your Dockerfile posted below. Still, when I run it, I get the same error... Any ideas on what may be going on? How did you manage to make it work? Would you mind sharing the command and code?
Thank you very much again!
Regards,
Daiana
I just realized that you use the GitHub file and not the Docker image; maybe that is what I am doing wrong... I'll check and get back to you!
You have to remove the
dockerPull: broadinstitute/gatk:4.1.2.0
line from your code. The Broad Institute's Docker container is part of the issue, so you don't want to use that one anymore.
I used the Dockerfile I posted below. I usually put my containers on Docker Hub, but the easiest solution would be to just put the Dockerfile right into the .cwl file. Like so:
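The snippet that originally followed is not preserved here. As a rough sketch only, embedding a Dockerfile in a CWL v1.0 tool can be done with the `dockerFile` and `dockerImageId` fields of `DockerRequirement`. The image name `gatk-ibmjdk` and the base image `ibmjava:8-sdk` below are assumptions for illustration, and the Dockerfile body is a placeholder, not the Dockerfile from this thread:

```yaml
hints:
  DockerRequirement:
    # Name used to tag and run the locally built image (hypothetical name).
    dockerImageId: gatk-ibmjdk
    # Dockerfile contents given inline; the runner builds it with `docker build`.
    dockerFile: |
      # Placeholder: paste the contents of the Dockerfile from the answer here.
      # Assumption: an IBM JDK base image, as suggested in the Stack Overflow thread.
      FROM ibmjava:8-sdk
      # ... install GATK and put the `gatk` launcher on PATH ...
```

With this in place there is no `dockerPull` entry, so the runner should build and use the local image instead of pulling `broadinstitute/gatk`.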