Can someone please help me run any GATK4 pipeline?
6.5 years ago
moxu ▴ 510

GATK4 is a great variant-calling software package and dominates the field. Unfortunately, I have not been able to run any of the GATK4 pipelines, probably because they all use Google Cloud Storage, which I am not familiar with. I am now trying to run the five-dollar genome analysis pipeline because I guess it is the easiest one to run.

I downloaded the pipeline using:

git clone https://github.com/gatk-workflows/five-dollar-genome-analysis-pipeline.git

The command line used was:

java -jar cromwell-31.jar run germline_single_sample_workflow.wdl --inputs germline_single_sample_workflow.hg38.inputs.json

Both the .wdl file and the .json file come from the GitHub repository and were unchanged when the command above was run.

The error messages from this run are attached at the bottom of this post.

Can someone please tell me what I need to do to make this work?

Thanks much!

--------------------------------- error message snippets -------------------------------------

[2018-05-31 10:08:28,06] [info] Running with database db.url = jdbc:hsqldb:mem:9f3b961e-97d8-4fc4-a30a-7e86c6f14bdc;shutdown=false;hsqldb.tx=mvcc
[2018-05-31 10:08:32,43] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2018-05-31 10:08:32,44] [info] [RenameWorkflowOptionsInMetadata] 100%
[2018-05-31 10:08:32,52] [info] Running with database db.url = jdbc:hsqldb:mem:11019e42-b5ee-4466-bbad-80dbf98f3c00;shutdown=false;hsqldb.tx=mvcc
[2018-05-31 10:08:32,81] [info] Slf4jLogger started
[2018-05-31 10:08:32,98] [info] Metadata summary refreshing every 2 seconds.
[2018-05-31 10:08:33,01] [info] KvWriteActor configured to flush with batch size 200 and process rate 5 seconds.

...

[2018-05-31 10:08:41,71] [warn] Local [6b73056d]: Key/s [memory, disks, preemptible] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-05-31 10:08:41,71] [warn] Local [6b73056d]: Key/s [preemptible, disks, cpu, memory] is/are not supported by backend. Unsupported attributes will not be part of job executions.
[2018-05-31 10:08:43,92] [info] WorkflowExecutionActor-6b73056d-8171-4712-a05a-b8dfcdeb36d6 [6b73056d]: Starting germline_single_sample_workflow.ScatterIntervalList
[2018-05-31 10:08:44,97] [info] fe6db5b3-d91a-40d4-a35b-cf3c937deaaa-SubWorkflowActor-SubWorkflow-to_bam_workflow:-1:1 [fe6db5b3]: Starting to_bam_workflow.GetBwaVersion
[2018-05-31 10:08:45,95] [warn] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.GetBwaVersion:NA:1]: Unrecognized runtime attribute keys: memory
[2018-05-31 10:08:45,95] [warn] BackgroundConfigAsyncJobExecutionActor [6b73056dgermline_single_sample_workflow.ScatterIntervalList:NA:1]: Unrecognized runtime attribute keys: memory
[2018-05-31 10:08:45,98] [error] BackgroundConfigAsyncJobExecutionActor [6b73056dgermline_single_sample_workflow.ScatterIntervalList:NA:1]: Error attempting to Execute
java.lang.Exception: Failed command instantiation
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:400)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand$(StandardAsyncExecutionActor.scala:340)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.instantiatedCommand$lzycompute(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.instantiatedCommand(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.standard.StandardAsyncExecutionActor.commandScriptContents(StandardAsyncExecutionActor.scala:235)
at cromwell.backend.standard.StandardAsyncExecutionActor.commandScriptContents$(StandardAsyncExecutionActor.scala:234)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.commandScriptContents(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.writeScriptContents(SharedFileSystemAsyncJobExecutionActor.scala:140)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.writeScriptContents$(SharedFileSystemAsyncJobExecutionActor.scala:139)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.cromwell$backend$sfs$BackgroundAsyncJobExecutionActor$$super$writeScriptContents(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.sfs.BackgroundAsyncJobExecutionActor.writeScriptContents(BackgroundAsyncJobExecutionActor.scala:12)
at cromwell.backend.sfs.BackgroundAsyncJobExecutionActor.writeScriptContents$(BackgroundAsyncJobExecutionActor.scala:11)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.writeScriptContents(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute(SharedFileSystemAsyncJobExecutionActor.scala:123)
at cromwell.backend.sfs.SharedFileSystemAsyncJobExecutionActor.execute$(SharedFileSystemAsyncJobExecutionActor.scala:121)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.execute(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.standard.StandardAsyncExecutionActor.$anonfun$executeAsync$1(StandardAsyncExecutionActor.scala:451)
at scala.util.Try$.apply(Try.scala:209)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync(StandardAsyncExecutionActor.scala:451)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeAsync$(StandardAsyncExecutionActor.scala:451)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.executeAsync(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover(StandardAsyncExecutionActor.scala:744)
at cromwell.backend.standard.StandardAsyncExecutionActor.executeOrRecover$(StandardAsyncExecutionActor.scala:736)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.executeOrRecover(ConfigAsyncJobExecutionActor.scala:191)
at cromwell.backend.async.AsyncBackendJobExecutionActor.$anonfun$robustExecuteOrRecover$1(AsyncBackendJobExecutionActor.scala:65)
at cromwell.core.retry.Retry$.withRetry(Retry.scala:37)
at cromwell.backend.async.AsyncBackendJobExecutionActor.withRetry(AsyncBackendJobExecutionActor.scala:61)
at cromwell.backend.async.AsyncBackendJobExecutionActor.cromwell$backend$async$AsyncBackendJobExecutionActor$$robustExecuteOrRecover(AsyncBackendJobExecutionActor.scala:65)
at cromwell.backend.async.AsyncBackendJobExecutionActor$$anonfun$receive$1.applyOrElse(AsyncBackendJobExecutionActor.scala:88)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
at akka.actor.Actor.aroundReceive(Actor.scala:514)
at akka.actor.Actor.aroundReceive$(Actor.scala:512)
at cromwell.backend.impl.sfs.config.BackgroundConfigAsyncJobExecutionActor.aroundReceive(ConfigAsyncJobExecutionActor.scala:191)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:527)
at akka.actor.ActorCell.invoke(ActorCell.scala:496)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: common.exception.AggregatedMessageException: Error(s):
:
java.lang.IllegalArgumentException: gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
at common.validation.Validation$ValidationTry$.toTry$extension1(Validation.scala:60)
at common.validation.Validation$ValidationTry$.toTry$extension0(Validation.scala:56)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:398)
... 42 common frames omitted
[2018-05-31 10:08:45,99] [info] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.GetBwaVersion:NA:1]: # not setting set -o pipefail here because /bwa has a rc=1 and we dont want to allow rc=1 to succeed because

the sed may also fail with that error and that is something we actually want to fail on.
/usr/gitc/bwa 2>&1 | \
grep -e '^Version' | \
sed 's/Version: //'
[2018-05-31 10:08:46,02] [info] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.GetBwaVersion:NA:1]: executing: docker run \
--cidfile /Users/moushengxu/softspace/mudroom/gatk/five-dollar-genome-analysis-pipeline/cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion/execution/docker_cid \
--rm -i \
\
--entrypoint /bin/bash \
-v /Users/moushengxu/softspace/mudroom/gatk/five-dollar-genome-analysis-pipeline/cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion:/cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion \
us.gcr.io/broad-gotc-prod/genomes-in-the-cloud@sha256:7bc64948a0a9f50ea55edb8b30c710943e44bd861c46a229feaf121d345e68ed /cromwell-executions/germline_single_sample_workflow/6b73056d-8171-4712-a05a-b8dfcdeb36d6/call-to_bam_workflow/to_bam_workflow/fe6db5b3-d91a-40d4-a35b-cf3c937deaaa/call-GetBwaVersion/execution/script
[2018-05-31 10:08:46,10] [info] fe6db5b3-d91a-40d4-a35b-cf3c937deaaa-SubWorkflowActor-SubWorkflow-to_bam_workflow:-1:1 [fe6db5b3]: Starting to_bam_workflow.CreateSequenceGroupingTSV
[2018-05-31 10:08:47,08] [warn] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.CreateSequenceGroupingTSV:NA:1]: Unrecognized runtime attribute keys: preemptible, memory
[2018-05-31 10:08:47,08] [error] BackgroundConfigAsyncJobExecutionActor [fe6db5b3to_bam_workflow.CreateSequenceGroupingTSV:NA:1]: Error attempting to Execute
java.lang.Exception: Failed command instantiation
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:400)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand$(StandardAsyncExecutionActor.sca

...


java.lang.IllegalArgumentException: gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems
at common.validation.Validation$ValidationTry$.toTry$extension1(Validation.scala:60)
at common.validation.Validation$ValidationTry$.toTry$extension0(Validation.scala:56)
at cromwell.backend.standard.StandardAsyncExecutionActor.instantiatedCommand(StandardAsyncExecutionActor.scala:398)
... 35 common frames omitted
[2018-05-31 10:12:05,44] [info] Automatic shutdown of the async connection
[2018-05-31 10:12:05,44] [info] Gracefully shutdown sentry threads.
[2018-05-31 10:12:05,44] [info] Starting coordinated shutdown from JVM shutdown hook

...

[2018-05-31 10:12:46,84] [info] WorkflowExecutionActor-6b73056d-8171-4712-a05a-b8dfcdeb36d6 [6b73056d]: WorkflowExecutionActor [6b73056d] aborted: SubWorkflow-to_bam_workflow:-1:1
[2018-05-31 10:12:47,72] [info] WorkflowManagerActor All workflows are aborted
[2018-05-31 10:12:47,72] [info] WorkflowManagerActor All workflows finished
[2018-05-31 10:12:47,72] [info] WorkflowManagerActor stopped
[2018-05-31 10:12:47,72] [info] Connection pools shut down
[2018-05-31 10:12:47,72] [info] Shutting down SubWorkflowStoreActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down JobStoreActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down CallCacheWriteActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] SubWorkflowStoreActor stopped
[2018-05-31 10:12:47,72] [info] Shutting down ServiceRegistryActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down DockerHashActor - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] Shutting down IoProxy - Timeout = 1800000 milliseconds
[2018-05-31 10:12:47,72] [info] CallCacheWriteActor Shutting down: 0 queued messages to process
[2018-05-31 10:12:47,72] [info] CallCacheWriteActor stopped
[2018-05-31 10:12:47,72] [info] JobStoreActor stopped
[2018-05-31 10:12:47,72] [info] KvWriteActor Shutting down: 0 queued messages to process
[2018-05-31 10:12:47,72] [info] DockerHashActor stopped
[2018-05-31 10:12:47,72] [info] WriteMetadataActor Shutting down: 37 queued messages to process
[2018-05-31 10:12:47,72] [info] IoProxy stopped
[2018-05-31 10:12:47,73] [info] WriteMetadataActor Shutting down: processing 0 queued messages
[2018-05-31 10:12:47,73] [info] ServiceRegistryActor stopped
[2018-05-31 10:12:47,74] [info] Database closed
[2018-05-31 10:12:47,74] [info] Stream materializer shut down
[2018-05-31 10:12:47,74] [info] Message [cromwell.core.actor.StreamActorHelper$StreamFailed] without sender to Actor[akka://cromwell-system/deadLetters] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

What have you tried?

java -jar cromwell-31.jar run germline_single_sample_workflow.wdl --inputs germline_single_sample_workflow.hg38.inputs.json

"What have you tried in order to solve the problem?" is what YaGalbi meant, not "What command did you use?"


gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list exists on a filesystem not supported by this instance of Cromwell. Supported filesystems are: MacOSXFileSystem. Please refer to the documentation for more information on how to configure filesystems: http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems


I noticed that. I ran it on my MacBook Pro (macOS 10.13.4), which should be the supported filesystem ("MacOSXFileSystem"). Do I have to do anything described at http://cromwell.readthedocs.io/en/develop/backends/HPC/#filesystems?
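
From skimming that page, my guess (and it is only a guess, untested) is that I need to give Cromwell a configuration file that adds the gcs filesystem to the Local backend, and then pass it with -Dconfig.file, something like:

# local_gcs.conf -- my guess at a minimal override config, untested
include required(classpath("application"))

google {
  application-name = "cromwell"
  auths = [
    {
      name = "application-default"
      scheme = "application_default_credentials"
    }
  ]
}

backend {
  default = "Local"
  providers {
    Local {
      config {
        filesystems {
          gcs {
            auth = "application-default"
          }
          local {
            localization: ["hard-link", "soft-link", "copy"]
          }
        }
      }
    }
  }
}

and then run:

java -Dconfig.file=local_gcs.conf -jar cromwell-31.jar run germline_single_sample_workflow.wdl --inputs germline_single_sample_workflow.hg38.inputs.json

Is that roughly right (assuming I have also run gcloud auth application-default login first)?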

6.5 years ago
vdauwera ★ 1.2k

Hey everyone, this is a bit of a rabbit hole; let's take a step back. If you're going to be running the pipelines on a platform other than Google Cloud, you need to use a (slightly) different pipeline script (for computational efficiency) and manage data access differently. I can follow up here: https://gatkforums.broadinstitute.org/wdl/discussion/12111/can-someone-help-me-run-any-gatk4-pipeline
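
In the meantime, if you just want to get something running locally, one rough workaround (a sketch only, not the official recommendation) is to download the gs:// reference files named in the inputs JSON and point the JSON at local copies, for example:

gsutil cp gs://broad-references/hg38/v0/wgs_calling_regions.hg38.interval_list /path/to/references/
gsutil cp gs://broad-references/hg38/v0/Homo_sapiens_assembly38.dict /path/to/references/
# ...and so on for every gs:// path listed in germline_single_sample_workflow.hg38.inputs.json,
# then edit the JSON so each gs:// URL is replaced by the corresponding local path.

Be warned that the hg38 reference bundle is large, and the published WDL still uses Google-style runtime attributes (disks, preemptible), which the Local backend will warn about and ignore, as in the log above.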


Was this ever followed up? I can't find the link, but I was hoping to understand how to pass a local directory to runtime { docker:, disks: } while running a WDL pipeline on macOS. I provided an absolute path for "disks", but I get "Unrecognized runtime attribute keys: disks". Not sure if I should open another question or whether it's okay to continue here.
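
In case it helps anyone who lands here later: my current guess (which may well be wrong) is that disks: is a Google Cloud runtime attribute that the Local backend simply warns about and ignores, and that local data is meant to come in as ordinary file inputs via the inputs JSON instead; Cromwell then bind-mounts the call directory into the container itself, as in the docker run -v line in the log above. So something like this in the inputs JSON (the key name here is made up for illustration):

{
  "my_workflow.my_task.ref_dict": "/Users/me/references/Homo_sapiens_assembly38.dict"
}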
