4.1 years ago
pablo
Hello,
I'm trying to run GATK FastaAlternateReferenceMaker to generate FASTA files from my VCF file.
I run:
gatk FastaAlternateReferenceMaker -R my_reference.fa -O my_output.fasta -V my_file.vcf
and I get this error message:
Using GATK jar /opt/apps/gcc-8.1.0/gatk-4.1.0.0/gatk-package-4.1.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /opt/apps/gcc-8.1.0/gatk-4.1.0.0/gatk-package-4.1.0.0-local.jar FastaAlternateReferenceMaker -R my_reference.fa -O my_output.fasta -V my_file.vcf
14:15:23.712 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/apps/gcc-8.1.0/gatk-4.1.0.0/gatk-package-4.1.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
14:15:30.426 INFO FastaAlternateReferenceMaker - ------------------------------------------------------------
14:15:30.427 INFO FastaAlternateReferenceMaker - The Genome Analysis Toolkit (GATK) v4.1.0.0
14:15:30.427 INFO FastaAlternateReferenceMaker - For support and documentation go to https://software.broadinstitute.org/gatk/
14:15:30.430 INFO FastaAlternateReferenceMaker - Executing as *** on Linux v3.10.0-1127.18.2.el7.x86_64 amd64
14:15:30.430 INFO FastaAlternateReferenceMaker - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_45-b14
14:15:30.430 INFO FastaAlternateReferenceMaker - Start Date/Time: November 3, 2020 2:15:23 PM CET
14:15:30.430 INFO FastaAlternateReferenceMaker - ------------------------------------------------------------
14:15:30.430 INFO FastaAlternateReferenceMaker - ------------------------------------------------------------
14:15:30.431 INFO FastaAlternateReferenceMaker - HTSJDK Version: 2.18.2
14:15:30.432 INFO FastaAlternateReferenceMaker - Picard Version: 2.18.25
14:15:30.432 INFO FastaAlternateReferenceMaker - HTSJDK Defaults.COMPRESSION_LEVEL : 2
14:15:30.432 INFO FastaAlternateReferenceMaker - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
14:15:30.432 INFO FastaAlternateReferenceMaker - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
14:15:30.432 INFO FastaAlternateReferenceMaker - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
14:15:30.432 INFO FastaAlternateReferenceMaker - Deflater: IntelDeflater
14:15:30.432 INFO FastaAlternateReferenceMaker - Inflater: IntelInflater
14:15:30.433 INFO FastaAlternateReferenceMaker - GCS max retries/reopens: 20
14:15:30.433 INFO FastaAlternateReferenceMaker - Requester pays: disabled
14:15:30.433 INFO FastaAlternateReferenceMaker - Initializing engine
14:15:30.853 INFO FeatureManager - Using codec VCFCodec to read file file:///my_file.vcf
14:15:30.859 INFO FastaAlternateReferenceMaker - Shutting down engine
[November 3, 2020 2:15:30 PM CET] org.broadinstitute.hellbender.tools.walkers.fasta.FastaAlternateReferenceMaker done. Elapsed time: 0.12 minutes.
Runtime.totalMemory()=623378432
org.broadinstitute.hellbender.exceptions.GATKException: Error initializing feature reader for path my_file.vcf
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:353)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:305)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:256)
at org.broadinstitute.hellbender.engine.FeatureManager.addToFeatureSources(FeatureManager.java:234)
at org.broadinstitute.hellbender.engine.FeatureManager.initializeFeatureSources(FeatureManager.java:208)
at org.broadinstitute.hellbender.engine.FeatureManager.<init>(FeatureManager.java:155)
at org.broadinstitute.hellbender.engine.GATKTool.initializeFeatures(GATKTool.java:417)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:638)
at org.broadinstitute.hellbender.engine.ReferenceWalker.onStartup(ReferenceWalker.java:36)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:136)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
at org.broadinstitute.hellbender.Main.main(Main.java:291)
Caused by: htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: Your input file has a malformed header: The FORMAT field was provided but there is no genotype/sample data, for input source: my_file.vcf
at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:263)
at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:102)
at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:127)
at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:120)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:350)
... 14 more
Caused by: htsjdk.tribble.TribbleException$InvalidHeader: Your input file has a malformed header: The FORMAT field was provided but there is no genotype/sample data
at htsjdk.variant.vcf.AbstractVCFCodec.parseHeaderFromLines(AbstractVCFCodec.java:185)
at htsjdk.variant.vcf.VCFCodec.readActualHeader(VCFCodec.java:111)
at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:79)
at htsjdk.tribble.AsciiFeatureCodec.readHeader(AsciiFeatureCodec.java:37)
at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:261)
... 18 more
Here is the header of my VCF file and its first few records, which look fine to me:
##fileformat=VCFv4.2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=HQ,Number=2,Type=Integer,Description="Haplotype Quality">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
Super-Scaffold_100001 672149 . C T . PASS DP=210;AF=0.51 GT:HQ
Super-Scaffold_100001 862122 . A T . PASS DP=305;AF=0.5 GT:HQ
Super-Scaffold_100001 931168 . C A . PASS DP=127;AF=0.5 GT:HQ
Super-Scaffold_100001 967240 . C T . PASS DP=127;AF=0.5 GT:HQ
Any idea why I get "Your input file has a malformed header: The FORMAT field was provided but there is no genotype/sample data"?
Best regards
I think the main problem is here:
Is that vcf file in the right location?
Yes, the vcf file is in the right location. Even if I use the full path, it does not work.
The FORMAT field was provided but there is no genotype/sample data, for input source: my_file.vcf
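Your #CHROM header line ends with a FORMAT column, but no sample columns follow it, and each record ends with GT:HQ without any genotype values. htsjdk treats a FORMAT column with no sample data as a malformed header, which is exactly what the error says. Either remove the FORMAT column (and the trailing GT:HQ from each record) or add a sample column that contains the genotypes.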
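If you go with the first option, here is a minimal Python sketch that drops the FORMAT column when no sample column follows it. The function names `drop_empty_format_column` and `fix_vcf` are made up for this example, it assumes an uncompressed, tab-delimited VCF, and I have not checked that GATK 4.1.0.0 then accepts the result, so treat it as a starting point rather than a tested fix.

```python
def drop_empty_format_column(lines):
    """Yield VCF lines, removing the FORMAT column if no sample columns follow it.

    Hypothetical helper for illustration; assumes tab-delimited input.
    """
    strip_format = False
    for line in lines:
        line = line.rstrip("\n")
        # Meta-information lines (##...) are passed through unchanged.
        if line.startswith("##"):
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # 8 fixed columns plus FORMAT and nothing after it:
            # the layout reported as malformed in the error above.
            strip_format = len(fields) == 9 and fields[8] == "FORMAT"
        if strip_format:
            # Keep only CHROM..INFO, dropping FORMAT (header) or GT:HQ (records).
            fields = fields[:8]
        yield "\t".join(fields)


def fix_vcf(src_path, dst_path):
    """Write a copy of src_path to dst_path with the empty FORMAT column removed."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for out in drop_empty_format_column(src):
            dst.write(out + "\n")
```

For example, `fix_vcf("my_file.vcf", "my_file.fixed.vcf")` writes a copy whose header and records stop at the INFO column. A VCF that already has sample columns is left unchanged. The `##FORMAT` meta lines stay in place; they only declare fields.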