I downloaded the reference genome and the known-sites VCF from the GATK bundle (https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0;tab=objects?pli=1&prefix=&forceOnObjectsSortingFiltering=false)
I sorted and deduplicated the BAM file, then tried to run BaseRecalibrator, but I got the error below.
gatk BaseRecalibrator -I sorted_dedup_test.bam -R $HG38 --known-sites resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf -O recal_data.table
Using GATK jar /home/sunghyepark_lab/packages/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/sunghyepark_lab/packages/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar BaseRecalibrator -I sorted_dedup_test.bam -R /home/sunghyepark_lab/test/test_files/HG38_Broad/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta --known-sites resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf -O recal_data.table
08:01:24.131 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/sunghyepark_lab/packages/gatk-4.1.8.1/gatk-package-4.1.8.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jan 11, 2021 8:01:24 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
08:01:24.316 INFO BaseRecalibrator - ------------------------------------------------------------
08:01:24.317 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.1.8.1
08:01:24.317 INFO BaseRecalibrator - For support and documentation go to https://software.broadinstitute.org/gatk/
08:01:24.317 INFO BaseRecalibrator - Executing as sunghyepark_lab@6655e26fb574 on Linux v3.10.0-327.3.1.el7_lustre.x86_64 amd64
08:01:24.317 INFO BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_252-b09
08:01:24.317 INFO BaseRecalibrator - Start Date/Time: January 11, 2021 8:01:24 AM UTC
08:01:24.318 INFO BaseRecalibrator - ------------------------------------------------------------
08:01:24.318 INFO BaseRecalibrator - ------------------------------------------------------------
08:01:24.318 INFO BaseRecalibrator - HTSJDK Version: 2.23.0
08:01:24.318 INFO BaseRecalibrator - Picard Version: 2.22.8
08:01:24.318 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
08:01:24.318 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
08:01:24.318 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
08:01:24.318 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
08:01:24.319 INFO BaseRecalibrator - Deflater: IntelDeflater
08:01:24.319 INFO BaseRecalibrator - Inflater: IntelInflater
08:01:24.319 INFO BaseRecalibrator - GCS max retries/reopens: 20
08:01:24.319 INFO BaseRecalibrator - Requester pays: disabled
08:01:24.319 INFO BaseRecalibrator - Initializing engine
08:01:24.933 INFO FeatureManager - Using codec VCFCodec to read file file:///home2/sunghyepark_lab/storage/Whole_Exome_Sequencing/NMDA_encephal_202012/resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf
08:01:24.944 INFO BaseRecalibrator - Shutting down engine
[January 11, 2021 8:01:24 AM UTC] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2224029696
**org.broadinstitute.hellbender.exceptions.GATKException: Error initializing feature reader for path resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf**
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:383)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:335)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:282)
at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:246)
at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:209)
at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:156)
at org.broadinstitute.hellbender.engine.ReadWalker.initializeFeatures(ReadWalker.java:68)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:709)
at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:50)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
**Caused by: htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file, for input source: resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf**
at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:263)
at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:102)
at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:127)
at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:121)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:380)
... 14 more
**Caused by: htsjdk.tribble.TribbleException$InvalidHeader: Your input file has a malformed header: We never saw the required CHROM header line (starting with one #) for the input VCF file**
at htsjdk.variant.vcf.VCFCodec.readActualHeader(VCFCodec.java:115)
at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:79)
at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:37)
at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:261)
... 18 more
What is the output of each of the following commands?
file resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf
head resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf
Is it because the file is compressed, or is the file corrupted?
You should use the .vcf.gz file, not the .vcf file, from their resource bundle.
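A rough way to tell the two cases apart (compressed versus damaged) is to look at the first two bytes of the file and then search for the #CHROM line that the error message complains about. The sketch below is only an illustration: `check_vcf` is a hypothetical helper name, not part of GATK, and it assumes a gzip or bgzip file starts with the bytes 1f 8b.

```shell
#!/usr/bin/env bash
# check_vcf is a hypothetical helper (not a GATK tool).
# It reports whether a file is gzip-compressed and whether a #CHROM header line is present.
check_vcf() {
    local path="$1"
    local magic
    # Read the first two bytes as hex; gzip/bgzip streams start with 1f 8b.
    magic=$(head -c 2 "$path" | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "1f8b" ]; then
        # Compressed: decompress on the fly and look for the header line.
        if gzip -dc "$path" 2>/dev/null | grep -q -m 1 '^#CHROM'; then
            echo "gzip-compressed, #CHROM header found"
        else
            echo "gzip-compressed, no #CHROM header"
        fi
    elif grep -q -m 1 '^#CHROM' "$path"; then
        echo "plain text, #CHROM header found"
    else
        echo "plain text, no #CHROM header"
    fi
}
```

Calling `check_vcf resources_broad_hg38_v0_Mills_and_1000G_gold_standard.indels.hg38.vcf` would then print one of the four messages. If it reports a gzip-compressed file, the .vcf name is probably what makes GATK read it as plain text, which would fit the suggestion to use the .vcf.gz file; that reading is an assumption, not something confirmed in this thread.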