Hi, I have assembled a genome to use as a reference. It consists of 36 fragments, stored in a file called contigs.fa:
NODE_2_length_146463_cov_55.024967
NODE_3_length_259_cov_163.339767
NODE_4_length_3636_cov_339.770905
NODE_5_length_78387_cov_47.698547
NODE_6_length_9697_cov_53.580593
NODE_7_length_27613_cov_55.561802
NODE_8_length_60671_cov_49.392410
.............. Total 36 fragments
I would like to consolidate GVCFs using GenomicsDBImport:
../00.bin/gatk-4.2.0.0/gatk --java-options "-Xmx1g -Xms1g -DGATK_STACKTRACE_ON_USER_EXCEPTION=true" GenomicsDBImport --sample-name-map raw_vcf_list.txt --genomicsdb-workspace-path rawAssignment3.GDBI --intervals ../01.data/contigs.fa
raw_vcf_list.txt is the sample name list generated beforehand. However, an error message pops up:
A USER ERROR has occurred: Couldn't read file ../01.data/contigs.fa. Error was: The file ../01.data/contigs.fa exists, but does not contain Features (ie., is not in a supported Feature file format such as vcf, bcf, bed, or interval_list), and does not have one of the supported interval file extensions ([.list, .intervals]). Please rename your file with the appropriate extension. If ../01.data/contigs.fa is NOT supposed to be a file, please move or rename the file at location /home/sandra/Downloads/Resequencing_Assigment/03.variationCalling/../01.data/contigs.fa
org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file ../01.data/contigs.fa. Error was: The file ../01.data/contigs.fa exists, but does not contain Features (ie., is not in a supported Feature file format such as vcf, bcf, bed, or interval_list), and does not have one of the supported interval file extensions ([.list, .intervals]). Please rename your file with the appropriate extension. If ../01.data/contigs.fa is NOT supposed to be a file, please move or rename the file at location /home/sandra/Downloads/Resequencing_Assigment/03.variationCalling/../01.data/contigs.fa
May I know how to modify the script so that I can use GenomicsDBImport to consolidate the GVCFs and carry on with the joint-calling analysis of the cohort? Thanks in advance.
In GATK, the option -L is designed to define intervals (a BED file, an interval list, etc.), not, as far as I can see, a list of "fragments". Furthermore, the suffix .fa should be reserved for FASTA files.
Thank you.
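To illustrate the point about -L: one possible way forward is to turn the FASTA headers into a plain interval file with one contig name per line and a .list extension, which is one of the extensions named in the error message. This is only a sketch under assumptions: make_interval_list is a hypothetical helper name (not part of GATK), contigs.list is an arbitrary output name, and it assumes each header looks like ">NODE_2_length_146463_cov_55.024967" with no space after the ">" and that these names match the contig names in the GVCF headers.

```shell
# Hypothetical helper: write one contig name per line from a FASTA file.
# $1 = input FASTA, $2 = output interval file (should end in .list or .intervals)
make_interval_list() {
    # Keep only header lines, strip the leading ">", and drop any
    # description that follows the first whitespace.
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" > "$2"
}

# Assumed usage with the paths from the question (not run here):
#   make_interval_list ../01.data/contigs.fa contigs.list
#   ../00.bin/gatk-4.2.0.0/gatk GenomicsDBImport \
#       --sample-name-map raw_vcf_list.txt \
#       --genomicsdb-workspace-path rawAssignment3.GDBI \
#       --intervals contigs.list
```

Each line of the resulting file names a whole contig, so all 36 fragments would be imported. It is worth checking the file by eye before passing it to GenomicsDBImport.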