Struggling to use the GATK HaplotypeCaller on my bam files.
5.5 years ago
lovegenetics ▴ 70

Good morning everyone,

I downloaded the current version to my Linux server from https://software.broadinstitute.org/gatk/download/ using the following command:

$ wget "https://github.com/broadinstitute/gatk/releases/download/4.1.2.0/gatk-4.1.2.0.zip"

The software was downloaded and the jar files were made available. Next, I wanted to run it on my BAM files to identify the different haplotypes present. I have two samples, and each sample has 5 BAM files. I already have other software packages installed on my server, such as SAMtools, Picard and IGV.

However when I ran this command:

$ java -jar gatk-package-4.1.2.0-local.jar -T HaplotypeCaller -R chr86800001-6851705.fasta -I BC01PCR1.sorted.bam -o BC01PCR1_raw_variants.vcf

I get the following error message:

A USER ERROR has occurred: '-T' is not a valid command.


***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.

Next, I tried this command where I just added the genotype mode option and changed the order of commands:

$ java -jar gatk-package-4.1.2.0-local.jar -R chr86800001-6851705.fasta -T HaplotypeCaller -I BC01PCR1.sorted.bam --genotyping_mode DISCOVERY -o BC01PCR1_rawvariants.vcf

And this is the error message I got:

***********************************************************************

A USER ERROR has occurred: '-R' is not a valid command.


***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.

Can someone please tell me what the problem is, what I should have done differently, or what I need to download in order to use GATK? I'm really interested in using the HaplotypeCaller tool on my files.

gatk haplotypecaller software error • 5.0k views

Can you check whether gatk itself is on your $PATH? I believe GATK4 now uses a wrapper script, so you should not need to call java -jar gatk-package-4.1.2.0-local.jar explicitly. If you are following a guide, make sure it is written for the version you are currently using and not for an older one, as things change.
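
A quick way to run that check is sketched below. It only tests whether a command named gatk resolves through PATH; the variable name gatk_status is just for illustration, and the directory in the comment is an assumption about where the zip was unpacked.

```shell
# Check whether the gatk wrapper script is reachable through PATH.
if command -v gatk >/dev/null 2>&1; then
    gatk_status="found at $(command -v gatk)"
else
    # Assumed location: the directory created by unzipping the release, e.g.
    #   export PATH="$PATH:/path/to/gatk-4.1.2.0"
    gatk_status="not on PATH"
fi
echo "gatk: $gatk_status"
```

If it reports "not on PATH", either add the unpacked GATK directory to PATH or call the wrapper by its full path.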


hi alhamidi.reem,

This error is expected. If you read the documentation carefully, you will notice that the correct way to run any GATK4 tool (let's say HaplotypeCaller) is:

java -jar gatk.jar HaplotypeCaller

The -T parameter was used in older versions (GATK3) and is no longer accepted in GATK4; the tool name is now the first argument after the jar.

In your second attempt you used -T again (which is wrong) and also changed the order of the arguments. As I can see from your command, you called the jar and then put the reference right after it, so GATK tried to find a tool named -R.
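
Applied to the file names from the question, the call would look roughly like the sketch below. It assumes the jar sits in the current directory and only prints the command as a dry run; the variable names are illustrative. Note that GATK4 also expects the output flag as uppercase -O rather than -o.

```shell
jar="gatk-package-4.1.2.0-local.jar"
ref="chr86800001-6851705.fasta"
bam="BC01PCR1.sorted.bam"
out="BC01PCR1_raw_variants.vcf"

# The tool name comes directly after the jar (no -T); the output flag is -O.
cmd="java -jar $jar HaplotypeCaller -R $ref -I $bam -O $out"

# Dry run: print the command. To execute it, run: eval "$cmd"
echo "$cmd"
```

GATK will additionally look for a .fai index and a .dict sequence dictionary next to the reference FASTA, and for an index of the BAM file, so those may need to be created first.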

Hope this will help you to run it correctly.

Cheers


Thank you all for your responses. I appreciate it.

5.5 years ago

In the GATK folder you should have an executable file named gatk.

You should either add it to your PATH or create a variable in your script that stores the path to this file:

gatk="path/to/the/file/gatk"

genome="path/to/genome/file/" # fasta file used to align the data e.g. hg38 for human
interval="path/to/interval/bed/file"  # required for targeted data e.g. Whole-Exome Sequencing

"$gatk" HaplotypeCaller \
    --java-options "-Xmx40g -Xms40g" \
    -R "$genome" \
    -L "$interval" \
    -O out.vcf \
    -I align.bam \
    -ERC GVCF \
    --max-alternate-alleles 3
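
Since the question mentions several BAM files per sample, the same call can be wrapped in a loop. The following is only a sketch: BC01PCR2.sorted.bam is a hypothetical second file name, the .g.vcf suffix is a naming convention rather than a requirement, and the loop prints each command instead of running it.

```shell
gatk="path/to/the/file/gatk"
genome="path/to/genome/file/"

# Derive one output name per BAM and print the command that would be run.
# BC01PCR2.sorted.bam is a made-up example name; list your own files here.
for bam in BC01PCR1.sorted.bam BC01PCR2.sorted.bam; do
    out="${bam%.sorted.bam}.g.vcf"   # strip ".sorted.bam", append ".g.vcf"
    echo "$gatk" HaplotypeCaller -R "$genome" -I "$bam" -O "$out" -ERC GVCF
done
```

Removing the echo turns the dry run into real calls, one per BAM file.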