GATK BaseRecalibrator error: how do I solve this?
3.9 years ago
kwanghoon ▴ 20

I downloaded the reference genome and the known-sites VCF from the GATK resource bundle (https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0;tab=objects?pli=1&prefix=&forceOnObjectsSortingFiltering=false)

I sorted and deduplicated the BAM file, then tried to run BaseRecalibrator, but I got the errors below.

gatk BaseRecalibrator -I sorted_dedup_test.bam -R $HG38 --known-sites resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf -O recal_data.table

Using GATK jar /home/sunghyepark_lab/packages/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar

Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/sunghyepark_lab/packages/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar BaseRecalibrator -I sorted_dedup_test.bam -R /home/sunghyepark_lab/test/test_files/HG38_Broad/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta --known-sites resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf -O recal_data.table

08:01:24.131 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/sunghyepark_lab/packages/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so

Jan 11, 2021 8:01:24 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.

08:01:24.316 INFO  BaseRecalibrator - ------------------------------------------------------------
08:01:24.317 INFO  BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.1.8.1
08:01:24.317 INFO  BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
08:01:24.317 INFO  BaseRecalibrator - Executing as sunghyepark_lab@6655e26fb574 on Linux v3.10.0-327.3.1.el7_lustre.x86_64 amd64
08:01:24.317 INFO  BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_252-b09
08:01:24.317 INFO  BaseRecalibrator - Start Date/Time: January 11, 2021 8:01:24 AM UTC
08:01:24.318 INFO  BaseRecalibrator - ------------------------------------------------------------
08:01:24.318 INFO  BaseRecalibrator - ------------------------------------------------------------
08:01:24.318 INFO  BaseRecalibrator - HTSJDK Version: 2.23.0
08:01:24.318 INFO  BaseRecalibrator - Picard Version: 2.22.8
08:01:24.318 INFO  BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
08:01:24.318 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
08:01:24.318 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
08:01:24.318 INFO  BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
08:01:24.319 INFO  BaseRecalibrator - Deflater: IntelDeflater
08:01:24.319 INFO  BaseRecalibrator - Inflater: IntelInflater
08:01:24.319 INFO  BaseRecalibrator - GCS max retries/reopens: 20
08:01:24.319 INFO  BaseRecalibrator - Requester pays: disabled
08:01:24.319 INFO  BaseRecalibrator - Initializing engine
08:01:24.933 INFO  FeatureManager - Using codec VCFCodec to read file file:///home2/sunghyepark_lab/storage/Whole_Exome_Sequencing/NMDA_encephal_202012/resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf

08:01:24.944 INFO  BaseRecalibrator - Shutting down engine
[January 11, 2021 8:01:24 AM UTC] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2224029696

**org.broadinstitute.hellbender.exceptions.GATKException: Error initializing feature reader for path resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf**
        at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:383)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:335)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:282)
        at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:246)
        at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:209)
        at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:156)
        at org.broadinstitute.hellbender.engine.ReadWalker.initializeFeatures(ReadWalker.java:68)
        at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:709)
        at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:50)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)

**Caused by: htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file, for input source: resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf**
        at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:263)
        at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:102)
        at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:127)
        at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:121)
        at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:380)
        ... 14 more

**Caused by: htsjdk.tribble.TribbleException$InvalidHeader: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file**
        at htsjdk.variant.vcf.VCFCodec.readActualHeader(VCFCodec.java:115)
        at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:79)
        at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:37)
        at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:261)
        ... 18 more

What is the output of

file resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf

and

head resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf
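
A quick way to check this without printing binary data to the terminal is to look at the file's first two bytes. The sketch below is only an illustration: `sniff_compression` is a made-up helper name, and it assumes a POSIX-style shell with `head -c` and `od` available. It only tells gzip-style compression (magic bytes 1f 8b, which bgzipped VCF and compressed BCF files also start with) apart from everything else.

```shell
# Hypothetical helper (not part of GATK or bcftools): print "gzip" if FILE
# starts with the gzip magic bytes 1f 8b, otherwise print "plain".
sniff_compression() {
    # Read the first two bytes and render them as lowercase hex, e.g. "1f8b".
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "1f8b" ]; then
        echo "gzip"
    else
        echo "plain"
    fi
}
```

If this prints `gzip` for a file named `.vcf`, then `gzip -dc < file | head` shows the decompressed content instead of raw bytes.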

Output of file resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf

resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf: gzip compressed data, extra field

And output of head resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf

[unreadable binary output]

Is it because the file is compressed, or is the file corrupted?


You should use the .vcf.gz file from their resource bundle, not the .vcf one.
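
If the file on disk is really a compressed VCF saved under a `.vcf` name, giving it the right extension may already be enough for the reader to pick the correct codec. A minimal sketch of that check, assuming `gzip` and `grep` are installed; `fix_vcf_extension` is a hypothetical helper name, not a GATK or bcftools command:

```shell
# Hypothetical helper: rename FILE to FILE.gz only if it is gzip-compressed
# and its decompressed content contains the "#CHROM" header line.
fix_vcf_extension() {
    f=$1
    # Not gzip at all: leave the file alone.
    if ! gzip -t < "$f" 2>/dev/null; then
        echo "$f is not gzip-compressed; nothing to rename" >&2
        return 1
    fi
    # Gzip, but no text VCF header inside (for example BCF): renaming will not help.
    if ! gzip -dc < "$f" 2>/dev/null | grep -q '^#CHROM'; then
        echo "$f has no #CHROM line after decompression; convert it instead" >&2
        return 2
    fi
    mv "$f" "$f.gz"
    echo "renamed to $f.gz"
}
```

GATK will still want an index next to the renamed file, and building one requires block-gzip (bgzip) compression rather than plain gzip. If the check returns 2, the file needs converting rather than renaming.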

3.9 years ago

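You downloaded resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf with a .vcf extension, but it is not plain text. It looks like a BCF file, or possibly a bgzipped VCF, since `file` reports gzip-compressed data. Either way, you can fix this with: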

bcftools view -O z -o tmp.vcf.gz resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf
mv tmp.vcf.gz resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
bcftools index -t resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf.gz

Wow it worked....

THANK YOU SO SO SO MUCH.

I don't know why the file from the resource bundle looks like BCF....

Anyway Thank you again!!!
