Question

GATK 4 and Spark multithreading

3.7 years ago

Vic ▴ 110

I would like to know how to use Spark within GATK for multi-threaded analysis. Unfortunately, the Broad Institute's documentation for its cluster/Spark tutorial is still in progress. I am using HaplotypeCaller, which has been working fine, but I now have some pooled-seq samples that take much longer to process, so I would like to spread the workload. This is an example of my usage:

gatk HaplotypeCaller -I my_pooled_sample.bam -I another_pooled_sample.bam -L a_chromosome -R ref_genome.fna -O my_out_file.g.vcf -ploidy 10 -- --spark-master local[2]

I took the Spark arguments above from this example, but the command didn't work. I checked the help output and got this:

>     gatk forwards commands to GATK and adds some sugar for submitting spark jobs
>      --spark-runner <target>    controls how spark tools are run
>          valid targets are:
>          LOCAL:      run using the in-memory spark runner
>          SPARK:      run using spark-submit on an existing cluster
>                      --spark-master must be specified
>                      --spark-submit-command may be specified to control the Spark submit command
>                      arguments to spark-submit may optionally be specified after --
>          GCS:        run using Google cloud dataproc
>                      commands after the -- will be passed to dataproc
>                      --cluster <your-cluster> must be specified after the --
>                      spark properties and some common spark-submit parameters will be translated
>                      to dataproc equivalents

I then tried using:

--spark-runner local[2]

That also didn't work. I would appreciate some guidance. Many thanks.

multithreading haplotypecaller sparks gatk • 2.7k views

ADD COMMENT • link updated 2.1 years ago by samuel ▴ 260 • written 3.7 years ago by Vic ▴ 110

cross posted https://stackoverflow.com/questions/67074318

ADD REPLY • link 3.7 years ago by Pierre Lindenbaum 164k

I am sorry, I didn't realise that wasn't allowed. I have deleted the other post.
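
On the original question, the help text quoted above lists only `LOCAL`, `SPARK` and `GCS` as valid `--spark-runner` targets, so `local[2]` looks like a value for `--spark-master` rather than for `--spark-runner`. Below is a minimal sketch of how the two might be combined. It rests on assumptions that are not confirmed anywhere in this thread: that the Spark variant of the tool (`HaplotypeCallerSpark`, a beta tool in GATK 4) is the one that accepts Spark arguments, and that it takes the same tool options as the command in the question. A single `-I` input is used because multi-input support in the Spark tools may depend on the GATK version. `THREADS`, `SPARK_MASTER` and `GATK_CMD` are names made up for illustration. The snippet only prints the command, so it does not need GATK to be installed.

```shell
# Sketch only: builds and prints the command instead of running it.
# Assumption: HaplotypeCallerSpark (beta) exists in the installed GATK 4 release.
THREADS=2                          # hypothetical variable: local Spark worker threads
SPARK_MASTER="local[${THREADS}]"   # a Spark master URL, not a --spark-runner target

# Options before the bare "--" are tool arguments taken from the question;
# options after it are the Spark launcher arguments described in the help text.
GATK_CMD="gatk HaplotypeCallerSpark -I my_pooled_sample.bam -L a_chromosome -R ref_genome.fna -O my_out_file.g.vcf -ploidy 10 -- --spark-runner LOCAL --spark-master '${SPARK_MASTER}'"

echo "${GATK_CMD}"
```

The master URL is wrapped in single quotes in the printed command because some shells treat square brackets as a glob pattern. Whether the Spark tool reproduces the non-Spark HaplotypeCaller results for pooled samples with `-ploidy 10` would need to be checked on a small region first.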